TSSJS Day 1: Google App Engine Datastore using RDBMS concepts

Google Datastore is transactional, natively partitioned, hierarchical, schema-less, and built on Bigtable.  It is NOT a relational database.  Because it is a hosted solution, you pay only for the services you use, and someone else manages upgrades, redundancy, and connectivity.

Terms to understand: a Kind is the table, a Key is the primary key, an Entity Group is the partition, and the 0..n properties are the columns.  Soft schema constraints are enforced at the application layer; there is no traditional DB schema.  Google believes that a DB schema is redundant with what JPA already provides, so the @Entity class effectively defines the schema.

Bigtable is just how it sounds…ONE BIG TABLE.  The “Kind” (the table) is prepended to the primary key and is always part of it, which keeps entities of the same Kind together in Bigtable.  Using surrogate keys makes it a bit more complicated.  You can store the parent as part of the key, as in /Person:18/Pet:Fido, and the ID = 1.
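To make the key layout concrete, here is a minimal sketch in plain Java of a key built as a path of Kind:id elements.  The KeyPath class and its methods are hypothetical names made up for illustration; this is not the App Engine API, just a model of the idea described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a Datastore-style key: a path of Kind:id elements. */
final class KeyPath {
    private final List<String> elements = new ArrayList<>();

    private KeyPath() {}

    /** Starts a root key such as /Person:18. */
    static KeyPath root(String kind, String id) {
        return new KeyPath().child(kind, id);
    }

    /** Returns a new key with one more Kind:id element appended. */
    KeyPath child(String kind, String id) {
        KeyPath k = new KeyPath();
        k.elements.addAll(elements);
        k.elements.add(kind + ":" + id);
        return k;
    }

    /** The Kind of the entity this key identifies (the last path element). */
    String kind() {
        String last = elements.get(elements.size() - 1);
        return last.substring(0, last.indexOf(':'));
    }

    @Override
    public String toString() {
        return "/" + String.join("/", elements);
    }
}
```

Calling KeyPath.root("Person", "18").child("Pet", "Fido") yields a key whose string form is /Person:18/Pet:Fido and whose Kind is Pet.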

Transactions are different with GDS.  An RDBMS relies on global transactions, but GDS uses local transactions that are scoped to a single entity group, meaning entities that share the same root (e.g., /Person:Ethel and /Person:Ethel/Person:Jane can be updated together, but an entity under a different root cannot join that transaction).  This makes entity group selection important.  Entity groups that are too coarse hurt throughput and can cause concurrent modification errors, while groups that are too fine limit the usefulness of transactions.  Try to group entities so that the entity group lines up with the transaction boundary, i.e., information that is committed together lives together.
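As a rough illustration of why entity group selection matters, the sketch below checks whether two key paths could take part in the same local transaction.  It assumes an entity group is identified by the first element of the key path; EntityGroups, rootOf, and sameEntityGroup are hypothetical helpers, and /Person:Max in the usage note is a made-up key.

```java
/** Hypothetical helper: decides whether two key paths share an entity group. */
final class EntityGroups {
    private EntityGroups() {}

    /** Returns the root element, e.g. "Person:Ethel" for "/Person:Ethel/Person:Jane". */
    static String rootOf(String keyPath) {
        String trimmed = keyPath.startsWith("/") ? keyPath.substring(1) : keyPath;
        int slash = trimmed.indexOf('/');
        return slash < 0 ? trimmed : trimmed.substring(0, slash);
    }

    /** Assumption: entities can share a local transaction only if their roots match. */
    static boolean sameEntityGroup(String a, String b) {
        return rootOf(a).equals(rootOf(b));
    }
}
```

With this model, sameEntityGroup("/Person:Ethel", "/Person:Ethel/Person:Jane") is true, while comparing /Person:Ethel with /Person:Max is false, so those two would need separate transactions.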

Relationships are managed automatically, but if you don’t like the default behavior, you can change it.  Joins are problematic when you need sorting or inequality filters.  So the net is that Bigtable scales linearly with the size of the result set, but joins are not well supported…YET!  In the meantime, you can use a SELECT that filters on denormalized fields, such as student and course.
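Here is a small sketch of the denormalization workaround, using an in-memory list in place of the Datastore.  The Enrollment class, its fields, and studentsIn are hypothetical; the point is only that the student and course values are copied onto one row, so a single-Kind filter replaces the join.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: filtering one denormalized Kind instead of joining two. */
final class Enrollments {
    private Enrollments() {}

    /** One denormalized row: student and course names are copied onto it. */
    static final class Enrollment {
        final String student;
        final String course;

        Enrollment(String student, String course) {
            this.student = student;
            this.course = course;
        }
    }

    /** Similar in spirit to: SELECT e.student FROM Enrollment e WHERE e.course = :course */
    static List<String> studentsIn(List<Enrollment> rows, String course) {
        List<String> result = new ArrayList<>();
        for (Enrollment e : rows) {
            if (e.course.equals(course)) {
                result.add(e.student);
            }
        }
        return result;
    }
}
```

The trade-off is the usual one for denormalized data: reads become a simple equality filter, but every copy of a value has to be kept in sync when it changes.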

Key takeaways

Simplified development and management of your application.  Using JPA, you get the typical RDBMS features you are used to (primary keys, relationships, etc.), and thus familiarity.  Understand that JPA on an RDBMS and JPA on GDS are not the same; there are differences.  It is also easier to move apps off GDS if you need to: portability is supreme!

To me the problem is transactions.  If you need to update multiple entities (think cascade updates or inserts for a big form on a GUI) that are part of the same Kind but live in different entity groups, then you have to transact each entity separately.  If one fails, you need a compensating transaction to undo the previous transactions that succeeded.  This seems problematic to me.
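To show what that compensating logic might look like, here is a minimal sketch in plain Java.  Step, apply, compensate, and runAll are hypothetical names, and each Step stands in for one local transaction plus the action that reverses it; nothing here is a Datastore API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of running several local transactions with compensation. */
final class CompensatingUpdate {
    private CompensatingUpdate() {}

    /** One local transaction plus the action that reverses it. */
    interface Step {
        void apply() throws Exception;
        void compensate();
    }

    /**
     * Applies the steps in order. If one fails, the steps that already
     * succeeded are compensated in reverse order and false is returned.
     */
    static boolean runAll(List<Step> steps) {
        Deque<Step> done = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.apply();
                done.push(step); // most recent success sits on top of the stack
            } catch (Exception e) {
                while (!done.isEmpty()) {
                    done.pop().compensate();
                }
                return false;
            }
        }
        return true;
    }
}
```

Note that this sketch does not handle a compensate() call that itself fails, and other readers can see the intermediate state before the undo runs, which is exactly why this approach feels problematic.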

For more info, check out http://gae-java-persistence.blogspot.com.
