Off topic: IMHO, everything that's been happening over the past few years is a self-fulfilling prophecy in no small part due to attitudes like this. Der Fuhrer did not have to put in much effort to convince the population when even those exposed to the outside world have met with enough suspicion and contempt to "know" (whether it's true or not) that most westerners have never seen us as equals, or even any sort of positive force.
Most probably don't even realize it. I see it as something similar to what racial minorities in the US go through: ask a random stranger on the street if he's racist, and he will honestly say no, even if he actually simply does not realize it, while it deeply affects how he sees the world.
I've also been seeing similar attitudes in relation to the Chinese. People avoiding excellent projects because they were written by some Chinese guy, including things where supply chain security is of no concern. Again apparently not realizing that these days a large part of the work on the Linux kernel is committed by paid employees of several large Chinese companies, all of them tightly intertwined with the government. Forget talking about who is building the hardware we all use.
Whatever, the internet is fracturing and balkanizing at full speed anyway, and the borders are slowly closing. Won't be long before we won't be able to exchange anything non-destructive anymore. It was good while it lasted.
>ask a random stranger on the street if he's racist, and he will honestly say no, even if he actually simply does not realize it
My lord you people are beyond patronizing.
When people refer to "the Chinese" or "the Russians", we are talking about the nation state, not the people. And there are legitimate security concerns. Whether we should be adversarial is another question. But we are.
I am wary of any supply chain attack and more so if the project is maintained by people with relationships in adversarial countries. The risk of exploitation outweighs the convenience.
Given that american ignorance is a cultural thing (with many people deliberately electing the way grandpa did it) is it not kind of racist to generalize americans as unknowingly racist?
You said, "ask a random stranger...and he will honestly say no" not "ask a random stranger...and he will probably honestly say no".
Most of most people are racist, it's just different groups. Americans obviously have less distrust of americans, but then I am just as certain that there are many many humans who would proudly share their "dumb american" stories as if that is not every bit as prejudicial to those of us who do not fit the description as any other "weak french" or "commie russian" or "sister fucking indian" or whatever else.
Were you using it for simple grep-style search, or did you actually need advanced search, e.g. BM25? ClickHouse will only help you with grep-like search, from what I understand.
ClickHouse has recently been a breath of fresh air after using timescaledb for a long time. psql is the greatest there is, and I really enjoyed being able to rely on a single database system to run everything, but when it came to migration maintenance and deployment it was really a pain. Development on timescaledb also feels a bit wishy-washy, with all the structural changes from version to version, and it really feels like an alpha product sometimes.
I was using TimescaleDB a very long time ago; things have changed quite a lot since (it's now even named differently).
In my current setup I was thinking of doing both: upgrading postgresql to timescaledb (to archive old data etc.), and deploying ClickHouse in parallel. I'm still considering whether to go big on PeerDB to get a ClickHouse mirror or just deploy it separately without an additional layer of fragility.
Would you not recommend using timescaledb at all? I definitely want to avoid alpha-quality software pain, since PostgreSQL is one of the most rock-solid parts of the stack at the moment.
Worked on peerdb. If you're able to batch changes on your end & push to both postgres & clickhouse, do that. Only move to peerdb when you know you need cdc
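To make "batch changes on your end & push to both" concrete, here is a minimal sketch. Everything in it is an assumption for illustration: DualWriter is a hypothetical helper, and the sinks are plain callables standing in for whatever Postgres and ClickHouse insert calls you actually use.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Sink = Callable[[List[Row]], None]


class DualWriter:
    """Buffers rows and flushes each batch to every sink in order.

    Hypothetical helper: the sinks stand in for real Postgres and
    ClickHouse insert calls, which are not shown here.
    """

    def __init__(self, sinks: List[Sink], batch_size: int = 1000) -> None:
        self.sinks = sinks
        self.batch_size = batch_size
        self.buffer: List[Row] = []

    def add(self, row: Row) -> None:
        """Queue one row and flush automatically once the batch is full."""
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Send the buffered rows to every sink, then start a new batch."""
        if not self.buffer:
            return
        batch, self.buffer = self.buffer, []
        for sink in self.sinks:
            # Each sink gets its own copy so one cannot mutate another's batch.
            sink(list(batch))
```

Note that this sketch does no error handling: if the second sink raises after the first succeeded, the two stores diverge. That failure mode is roughly the point at which CDC starts to look worth its cost.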
Still via Grafana. I ran it side-by-side with Loki, and despite trying to optimise Loki and using ClickHouse out of the box, it really was shocking how much faster ClickHouse was for every single query (e.g. "in the last 12 hours give me the frequency of logs with a particular JSON event" or even "find this log entry, then join back and find the number of times a different entry appears within the same correlation_id").
Not really, ClickHouse is super forgiving so you can do something like:
CREATE TABLE default.events (
    `timestamp` DateTime,
    `event` String,   -- e.g. 'product.updated', or empty
    `message` String, -- human readable message
    `raw` String      -- the raw message - this is very useful when pushing logs that aren't JSON - you just leave `event` empty and dump the entire message here
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)
ORDER BY (timestamp, event)
TTL timestamp + toIntervalMonth(6)
ClickHouse is extremely performant even in the cases of e.g.: SELECT count(*) FROM `events` WHERE `raw` LIKE '%hello world%'
Of course, the more columns you splat out (e.g. correlation_id, user_id, order_id, etc) the better you can index and expect those queries to perform, but in general I don't bother outside the obvious core domain ones (shown above). The performance is so good that unindexed queries are significantly faster than indexed queries in Loki. I have reached the point where I JSON extract on-the-fly in the WHERE clause of very large queries with no meaningful performance issues.
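For anyone wondering what "JSON extract on-the-fly" looks like, here is a rough sketch, not tested against a live server. It assumes `raw` holds a JSON object with a `correlation_id` key (that key and the hourly_hits helper are made up for illustration). JSONExtractString and the {name:Type} parameter placeholders are ClickHouse features; the Python wrapper just keeps the query text and its parameters together.

```python
from typing import Dict, Tuple

# ClickHouse SQL kept as a plain string. {name:Type} is ClickHouse's
# server-side query parameter syntax, not Python string formatting.
HOURLY_HITS_SQL = """
SELECT toStartOfHour(timestamp) AS hour, count() AS hits
FROM default.events
WHERE timestamp >= now() - toIntervalHour({hours:UInt32})
  AND JSONExtractString(raw, 'correlation_id') = {cid:String}
GROUP BY hour
ORDER BY hour
"""


def hourly_hits(cid: str, hours: int = 12) -> Tuple[str, Dict[str, object]]:
    """Return the query text plus the parameters to send alongside it."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return HOURLY_HITS_SQL.strip(), {"cid": cid, "hours": hours}
```

Hand both return values to whichever client you use. If a key like this ends up in most of your queries, that is usually the signal to promote it to a real column.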
> You can open a pull request as an experiment, without aiming for it to be merged - it will be tested with the same level of scrutiny as production releases. Found a new memory allocator, a new compression library, a new hash table, a data format, or a sorting algorithm? - bring it to ClickHouse, and it will expose it inside-out
ClickHouse dev here, but this is true. ClickHouse has helped find several bugs in our third-party libs (jemalloc and librdkafka for sure; there are many more, but I only worked on these), in the Linux kernel and basically everywhere. We have very rigorous fuzzers (yes, multiple fuzzers on multiple levels), and run tests in an insane number of configurations. I think the last number I heard, a year ago, was around 400 hours for a complete CI run for a single commit (not PR, but commit). So yeah, pretty insane, in a good way.
If your data is too big for postgres, it seems like moving straight to Clickhouse is the best option. We have been through a whole array of distributed database technologies, and Clickhouse might be the first one that doesn't have too many compromises.
Clickhouse has been a game changer for some of the companies I have worked at in the past. This reminds me of this podcast episode (1) from the Rust in Production pod about their Rust adoption.
Same. We replicated some data from Postgres, it was easy to set up, similar enough that the transition was trivial, and really good performance out of the box. One of those good "use the right tool for the job" experiences.
Clickhouse is *really* gatekeeping the "zero copy replication" where you store data on object-storage and have high availability from the open source version.
I think that is just the nature of the open core business - but like most such businesses, they're not very clear that this is what they are, pretending to be an open source business instead.
The query speed deserves the praise, but the JSON path has quiet footguns nobody mentions here. 64-bit integer columns (Int64/UInt64) come back as quoted strings over JSONEachRow whenever output_format_json_quote_64bit_integers is on, so a forgotten Number() cast silently turns arithmetic into string concatenation; smaller integers and floats come back as plain JSON numbers. And with input_format_skip_unknown_fields enabled, a single typo in a column name drops that field with no error at all. Worth wiring into CI an assertion that inserts a row and reads it back before trusting the dashboards.
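A read-back check along those lines could look roughly like this. It is only a sketch: round_trip_problems is a hypothetical helper, and it assumes you already have the row you inserted plus one JSONEachRow line fetched back from the table (the insert and select themselves are not shown).

```python
import json
from typing import Any, Dict, List


def round_trip_problems(inserted: Dict[str, Any], read_back_line: str) -> List[str]:
    """Compare an inserted row with one JSONEachRow line read back from the table.

    Reports fields that vanished and values whose Python type changed
    (for example an int that came back as a quoted string).
    """
    returned = json.loads(read_back_line)
    problems: List[str] = []
    for key, value in inserted.items():
        if key not in returned:
            problems.append(f"missing field: {key}")
        elif type(returned[key]) is not type(value):
            problems.append(
                f"type changed for {key}: "
                f"{type(value).__name__} -> {type(returned[key]).__name__}"
            )
        elif returned[key] != value:
            problems.append(f"value changed for {key}")
    return problems
```

In CI you would assert that the returned list is empty. The type comparison is deliberately strict, so a float that serializes as a whole number (9.0 coming back as 9) is also flagged; loosen that if it is too noisy for your data.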
Managers rejected it because it wasn't well known and was seen as "some database made by Russians."
On a personal level, it's quite sad to have seen that train coming so early and not been able to get on board.