How do we maintain multiuser controls across a "globally distributed database"?

Hello,

I am running a SQL Anywhere v17 server (actually an HA collection of servers managing multiple databases but to keep things "simple" assume we have only one database) on AWS. Access to the database needs to be available literally from any location across the planet. AWS makes available multiple Regions across the planet in which redundant servers could host the servers and database and this reduces access latency. While I have no detailed knowledge of them, I know that SQL Anywhere v17 offers synch capabilities so that all database copies in these various Regions could be kept "in synch" within a "short" period of time.

However, I have what I think is a very difficult ("unsolvable"?) architectural problem. The clients accessing the database provide "multiuser" access and controls. This has to be done to control simultaneous access when clients all need access to the "same set of data". For example, using the classic textbook example of an inventory system, multiple clients can each fetch the inventory for an item, some with the intent of reducing the item's stock count due to a sale and others with the intent of increasing the stock count due to the arrival of more items. The clients must be built so that the 1st client to access the inventory row "locks it" forcing all others into a read-only mode. All of these types of control are currently working (on our real data not the example inventory data). My question is not about how to manage multiuser access but, given that we must implement multiuser controls, how can I "distribute" a database to multiple locations but still maintain the multiuser controls?

"Simple" database replication obviously won't work here and I suspect that there is no "Sybase supplied" functionality that solves this problem. I am expecting to hear that the only solution is to build some sort of "very fast" messaging system in which we have to manage access that we manually coordinate between database systems. This has to be a "common" (or at least there is some number of others who have the same need) problem.

I'm very interested in everyone's comments and suggestions.

Thank you.

multiple-servers

asked 14 Jul '17, 14:35

AlK
735●31●35●54
accept rate: 37%

I do not know the answer, but I have the impression such a very fast syncing distributed system might be way more complex and have higher latency than a solution with one database per client (which therefore can use the usual builtin locking facilities), even that may not be located "near" the client...

(14 Jul '17, 14:51) Volker Barth

Replies hidden

Hi Volker,

Thanks for your reply but I'm afraid I don't understand the part about "a solution with one database per client". If I take what you say literally, using my simple example about the inventory control system, would that not mean that each client would be talking to a database that is only accessed by that client? If that's the case, of course, there would really be no inventory control and I just don't understand how that would work. Therefore, I suspect you are suggesting something a bit different that I don't understand. Please clarify what you meant, if you would. Thank you.

(14 Jul '17, 16:22) AlK

No, I just think that a non-distributed database located in one region might be easier and faster to access than a distributed/synced one.

(14 Jul '17, 16:46) Volker Barth

I assume your underlying question is this: Does any DBMS product offer a general solution to calculating real-time inventory levels across a distributed database setup?

AFAIK the answer is no, not at the DBMS level.

The problem is monstrously difficult, as a simple Google search indicates:

how does amazon calculate real-time inventory levels

Keep searching, and the problem reveals more and more difficulty, and stories of the billion$ Amazon has spent to develop application solutions.

You're not Amazon, of course, but the problem is still difficult, and it is beyond the capability of DBMS software alone, with or without synching software.

Do you want to solve the general problem, or develop a solution to your particular business case? (with the emphasis on "business case", not technical features).

(15 Jul '17, 05:45) Breck Carter

Replies hidden

@AIK: I was going to delete my comment because it might seem unnecessarily harsh and discouraging, but then someone went and upvoted it :)

Aside from writing "RDBMS" instead of "DBMS", what I really should have said is this:

AFAIK SQL Anywhere, MobiLink and SQL Remote do not directly answer your question "How do I implement cross-database row locks", nor does it offer direct support for calculating real-time inventory levels across distributed databases.

...except perhaps for this: Support for Three-tier computing and distributed transactions is offered by SQL Anywhere, independent from MobiLink and SQL Remote.

Also, HA may be a complicating factor. Users can't get update locks on rows in the secondary database or any of the read-only scale-out copies because they are, well, read-only as a whole. However, users can get schema locks on the secondary database, simply be doing a SELECT, and those locks can block some incoming operations from the primary (e.g., LOAD TABLE requires an exclusive table lock)... when that happens, the user's operation on the secondary gets a funky exception because of the rule "Primary Is King!"

...which may just be a red herring for you :)

(15 Jul '17, 08:03) Breck Carter

@AIK: This is off-topic, but it sounds like you could answer this question: Is it possible to licence and run SQL Anywhere on Amazon AWS?...

...please! :)

(15 Jul '17, 08:10) Breck Carter

Replies hidden

Let me try to re-state my point:

In my understanding, you want to improve access time resp. decrease latency by providing distributed copies of the same database to your world wide-spread users.

Now, if the distribution requires serialization of these databases because of multi-user control on a transaction scope, it means these distributed databases must communicate before the transaction can be committed. Basically, that would mean the 2PC protocol Breck has mentioned.

So I would think that this access in the background between several databases (located in different regions of the world) would probably mean way more communication overhead and worse latency compared to a solution with a non-distributed database, even when the latter would necessarily be located in one place and therefore not opimally reachable for all users.

I guess this would hold both for optimistic and pessimistic locking strategies.

However, if the application itself is running in the cloud, it might be helpful to distribute that but use one HA database.

(15 Jul '17, 08:22) Volker Barth

What about Cassandra DB (or other no-SQL DBMS)? It offers the distribution of data, but it has also own limitations …

(15 Jul '17, 09:01) Vlad

Replies hidden

Have not used any of those No-SQL systems but AFAIK, they usually do not really guarantee ACID principles, see here for Cassandra's tunable consistency levels in favour of throughput, so I guess they might not offer the desired "multi-user control".

(17 Jul '17, 03:00) Volker Barth

This is the trade-in price that people should pay if they have the distributed DB around the globe. I was thinking about this question for some time, but I do not have a good solution. That is why I have proposed the distributed DBMS that could act at least as a single piece of software. E.g. there are several things that I "see".

One wants to have SAs around the globe, so the users will be able to connect to the closest instance and use it. What difficulties we have? Data replication (and thus performance) and data consistency. User management in theory can be implemented with a custom application/scripts or by delegation to a dedicated authorization component (LDAP or whatever). Normal databases do not like conditions when pings are long and the replication is slow. This solution is quite unreliable, because you have to manage independent DB instances, have something to manage users, load balancing and something else.

No-SQL DBs have a different concept (I don't say better, I say different). It is easier to have the data sharding, there are many ready-to-use solutions with the distributed concept. But again, there is a price for it - you have to have the redundant data and there is a quorum, where some nodes decide that this particular piece of data is replicated to at least 2-3 nodes. Speed is guaranteed, but the replication is delayed in time. But this software behaves as a single component and at least (I hope :) ) it is easier to administrate it or scale it up.

Usually these kind of solutions are not used alone. E.g. Cassandra has fast inserts, but slow reads and people add an additional DB to it.

Well... I do not know the correct answer, but I do not think there is a silver bullet for the question.

Update: oh! if I was thinking too deep, but the task is simple and the number of databases is ... low.. two-three, then I do not think that my comment has been relevant at all :) Feel free to delete it.

(17 Jul '17, 04:04) Vlad

/> Feel free to delete it.

Deleting other people's comments is not acceptable behavior on this forum. Thank you.

(17 Jul '17, 07:02) Breck Carter

You shouldn't blame a person, because he accidentally misclicked :(
But I like the way how you read my comments and reply to them.

(17 Jul '17, 09:47) Vlad

Thanks to all for this discussion. It reinforces my initial suspicion that this is as hard as I thought and that there certainly is no off the shelf solution.

(26 Jul '17, 19:14) AlK

Hi Breck, Based on conversations several months ago with the HA Product Manager there is no license problem running on AWS. I suspect, and certainly in our case, the license that makes sense is a Core based one since it may be (probably is in the cloud) unlikely one can specify a specific number of concurrent clients. Then the challenge becomes determining what "level" of AWS compute resource is needed and matching that to the # Cores in the License.

(26 Jul '17, 19:17) AlK

showing 5 of 14 show all flat view

Be the first one to answer this question!

toggle preview

community wiki:

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

Markdown Basics

*italic* or _italic_
**bold** or __bold__
link:[text](http://url.com/ "title")
image?![alt text](/path/img.jpg "title")
numbered list: 1. Foo 2. Bar
to add a line break simply add two spaces to where you would like the new line to be.
basic HTML tags are also supported

learn more about Markdown

Question tags:

multiple-servers ×7

question asked: 14 Jul '17, 14:35

question was seen: 1,344 times

last updated: 26 Jul '17, 19:17

SAP SQL Anywhere

How do we maintain multiuser controls across a "globally distributed database"?

Follow this question

Related questions