0Day Forums
Cassandra Schema Management with CQL scripts - Printable Version

+- 0Day Forums (https://0day.red)
+-- Forum: Coding (https://0day.red/Forum-Coding)
+--- Forum: Database (https://0day.red/Forum-Database)
+---- Forum: Cassandra (https://0day.red/Forum-Cassandra)
+---- Thread: Cassandra Schema Management with CQL scripts (/Thread-Cassandra-Schema-Management-with-CQL-scripts)



Cassandra Schema Management with CQL scripts - unlives86894 - 07-30-2023

To Cassandra Experts: I am tasked with a work of coming up with recommendations on Cassandra CQL script management and deployment. How teams manage (should manage) large number of CQL scripts (schema definition scripts (DDL), data manipulation scripts (INSERT/UPDATE/DELETE) from the inception of Cassandra development and through subsequent changes to the application schema model. If I may, I would like to point out that the development team size is not that small (10+ developers per application functionality area).

One way (probably the wrong way) is to do what a typical relational database shop would do: app developers or development dbas design and create ddl, dml et., scripts, store and maintain them in version control system (e.g. SVN), and deploy the scripts in an environment (dev, qc, etc. ) using some automation (may be as simple as shell or perl script). I think where this breaks down in NoSQL solution such as Cassandra is the actors involved in these three steps.
1 - design and create CQL scripts - should this be done by DevOps (cassandra admins) or application developers?
(2) store and maintain them in SVN - should this be done similar to (1) above and (3) deployment of scripts - should someone from application development do this (or) DevOps do this?
I would also like to get answers from application schema control and auditing viewpoint. For example, for #1 and #2 above, if application developers design, create and store the CQL scripts in SVN, how can one be able to control what gets into the CQL schema and prevent costly errors. If there is dedicated, single team owning the data model rather than all cassandra developers (akin to DBA/Administrators), it is easier to achieve that control.

I am hoping someone those who have done this before could shed some insight into the choices and best practices for CQL code development, deployment and maintenance in a large environment.
Thanks as always.



RE: Cassandra Schema Management with CQL scripts - jeslynfgq - 07-30-2023

I think the main issue you'll be confronted with is that you'll need to write code to perform some migrations, which is a significant difference to applying delta patches in a typical SQL scenario. Basic changes to the schema (as defined using CQL) can easily be applied using the `cqlsh` tool in a DevOps/DBA style. These types of changes would include adding columns and removing columns. But if you need to do something more fundamental, then you're going to have to write CQL client code to migrate the old data. This is especially true the more denormalization and non-declarative indexing your app requires.

FWIW and YMMV I was able to automate one aspect of CQL schema management which was to find a way to keep the schema and the application code in sync. To achieve this I wrote a [CQL schema compiler][1] that generates boilerplate application source code so that data binding is always in sync with the current schema in Cassandra. But that is just one aspect of the overall problem.

[1]:

[To see links please register here]