This is the same technique that Looker uses for all DBs, not just the ones that don't support CTEs.
Unfortunately, it trades what would be two O(N) operations for one O(N^2) operation.
Additionally, I have found it extremely difficult to debug or tune queries written like this.
Yeah, I co-invented it at Looker. Looker uses several different techniques to achieve these calculations. The cost is on the order of a count(distinct). Sorry you found it difficult.
Lloyd did a great video presentation of this article at Data Council in March 2023 - I found the accompanying visuals a great addition, link is here: https://www.youtube.com/watch?v=zmmJgwc3oPI
Somewhat tangential, but I'm curious how much you have explored Graph databases and Cypher as an alternative to SQL as well? (https://neo4j.com/docs/getting-started/current/cypher-intro/)
I've personally been interested in it for some time, but Cypher as a query language is such a pain to write that I've never really invested much time in it.
It *seems* like Malloy might be trying to solve a somewhat similar pain point in that it's reducing the duplication when querying data.
Perhaps they're not exactly comparable, since graph databases are a completely different type of database, whereas Malloy interacts with relational DBs in a different way than SQL does. But I was still curious whether you've explored graph DBs & Cypher and have any opinions on them.
Very interesting. I could see a lot of value here. How do you handle the various dialects?
The differences between dialects are abstracted. Most of the differences are in the way they handle nested data. We currently support BigQuery, Postgres and DuckDB. You can see the code here: https://github.com/malloydata/malloy/blob/main/packages/malloy/src/dialect/dialect.ts
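For anyone wondering what "abstracted" can look like in practice, here is a rough sketch. It is an assumption for illustration, not the code in the linked file: the `Dialect` interface, `unnestJoin`, and `flattenQuery` are hypothetical names. It shows one query shape (flattening an array column) rendered per dialect, covering only BigQuery and Postgres.

```typescript
// Hypothetical sketch: names below are illustrative, not Malloy's actual API.
interface Dialect {
  name: string;
  // How this dialect quotes a table or column identifier.
  quoteIdentifier(id: string): string;
  // How this dialect joins the elements of an array expression as rows.
  unnestJoin(arrayExpr: string, alias: string): string;
}

const bigQuery: Dialect = {
  name: "bigquery",
  // BigQuery quotes identifiers with backticks.
  quoteIdentifier: (id) => "`" + id + "`",
  unnestJoin: (arrayExpr, alias) =>
    "LEFT JOIN UNNEST(" + arrayExpr + ") AS " + alias,
};

const postgres: Dialect = {
  name: "postgres",
  // Postgres quotes identifiers with double quotes (doubling embedded quotes).
  quoteIdentifier: (id) => '"' + id.replace(/"/g, '""') + '"',
  unnestJoin: (arrayExpr, alias) =>
    "LEFT JOIN LATERAL unnest(" + arrayExpr + ") AS " + alias + " ON true",
};

// The query builder only talks to the Dialect interface, so the same
// logical query renders differently depending on the target database.
function flattenQuery(d: Dialect, table: string, arrayCol: string): string {
  const t = d.quoteIdentifier(table);
  const col = d.quoteIdentifier(arrayCol);
  return "SELECT item FROM " + t + " AS t " + d.unnestJoin("t." + col, "item");
}

console.log(flattenQuery(bigQuery, "orders", "items"));
console.log(flattenQuery(postgres, "orders", "items"));
```

The point is just that the caller never branches on the database name; each dialect object owns its own syntax for quoting and unnesting.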