What's the best way to store binaries in the database with the current Sybase version 16?
Regards, Frank

asked 05 Dec '14, 08:28 Frank Vestjens
I used option 1 when it was the only choice. LOTS OF HEADACHES accompany this method: files get deleted, renamed, or moved, and the stored links become useless; database migrations to upgraded servers, backups, and restores get out of sync; etc.

I have used option 2 for several years now with SQLA and HIGHLY RECOMMEND this approach. SQLA handles large blobs very efficiently, and you get a reliable ACID data set. You may say the large db and log files are a downside, but... you ARE going to use that space somewhere, so why not use it in the database file and save yourself MANY other headaches and maintenance issues? We have many gigabytes of blobs in our databases and would not change our approach.

answered 05 Dec '14, 11:35 Bill Aumen
I'm not aware of any blob-related extensions in v16, and I can't even imagine how a database could "manage" externally stored data in an ACID-compliant fashion as if it were stored internally - the database engine cannot fully control the contents of external files... Some topics that might be related:
(As a different topic: LOAD TABLE has had several enhancements in newer versions w.r.t. logging "external data" - here meaning the source data - to make the load recoverable, but I guess that's not the issue here...) Having said that, I guess the "best way" you ask about really depends on your personal trade-off between your points 1 and 2.

answered 05 Dec '14, 09:28 Volker Barth
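As a side note on the LOAD TABLE enhancements mentioned above, the logging clauses look roughly like this (a sketch based on SQL Anywhere 12+ syntax; the table and file names are invented, so check your version's documentation):

```sql
-- WITH CONTENT LOGGING writes the loaded data itself into the
-- transaction log, so the load is recoverable from the log even
-- if the source file is later deleted, renamed, or moved.
-- (Hypothetical table and path, for illustration only.)
LOAD TABLE Blobs ( id, data )
    FROM 'c:\import\blobs.dat'
    WITH CONTENT LOGGING;
```

The older default (logging only the file name) is cheaper but ties recoverability to the continued existence of the external source file, which is exactly the class of problem this thread is about.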
We offer our users both Option 2 (the default) and Option 1; only our larger users tend to go for Option 1. We manage all access to the blobs via the database server - absolutely no direct client access. Which of the two methods is in use is transparent to applications, as all access goes through the same database stored procedures. This way of controlling access seems to avoid the missing / moved files issues that Bill mentions, since only the database server has access to the files.

We compress all usefully compressible blobs (i.e. not jpeg, pdf, etc.) at the client end and store the compressed version - this seems faster and minimises bandwidth. We've found little difference in performance between the two methods. We haven't used any of the new file-handling options in v16. If clients choose the external route, they of course have to deal with the backup / synchronisation issues you mention.

answered 05 Dec '14, 11:56 Justin Willey
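The server-mediated pattern described above could be sketched roughly as follows. All names here (procedure, table, columns) are invented for illustration - the thread does not show the actual schema - but `xp_read_file` is a real SQL Anywhere system procedure that reads a file on the *server's* file system:

```sql
-- Hypothetical sketch: one procedure hides whether a blob is stored
-- in-row (option 2) or in an external server-side file (option 1).
-- Clients only ever call the procedure, never touch the file system.
CREATE PROCEDURE GetDocumentContent( IN @doc_id INTEGER )
RESULT ( content LONG BINARY )
BEGIN
    IF EXISTS ( SELECT 1 FROM Documents
                WHERE doc_id = @doc_id
                  AND stored_externally = 0 ) THEN
        -- Option 2: blob lives in the database
        SELECT content FROM Documents WHERE doc_id = @doc_id;
    ELSE
        -- Option 1: only the server reads the external file,
        -- so clients need no rights on the blob directory
        SELECT xp_read_file( file_path ) FROM Documents
        WHERE doc_id = @doc_id;
    END IF;
END;
</sql>
```

Because applications depend only on the procedure's result set, the storage method can be switched per installation without client changes.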
Wow, that seems a very nice solution!
Do you somehow prevent "local" access to these files/directories, say by particular OS-based permissions?
Just as an idea: as you already let users choose between both options and handle that transparently, it might also be possible to extend the backup/restore method (via stored procedures, or as events following a backup) to synchronize the external files too, say with tools that allow a "mirror-wise copy" like MS's robocopy... That would already come close to option 3. Just my two cents, apparently :)

(Note: I'm saying that based on experience with a third-party tool that uses option 1 and leads to all the problems Bill mentions... I'd be glad if that vendor had incorporated at least part of your efforts!)
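That backup-event idea could be sketched along these lines - purely hypothetical paths and event name, but `BackupEnd` is a real SQL Anywhere system event type and `xp_cmdshell` a real system procedure:

```sql
-- Hypothetical sketch: after every backup, mirror the external blob
-- directory to the backup location with robocopy, keeping the file
-- copies roughly in step with the database backup.
CREATE EVENT sync_external_blobs
TYPE BackupEnd
HANDLER
BEGIN
    -- /MIR makes the target an exact mirror of the source;
    -- both paths are invented examples
    CALL xp_cmdshell(
        'robocopy "D:\blobs" "\\backupserver\blobs" /MIR',
        'no_output' );
END;
```

This narrows, but does not close, the consistency gap: files changed between the backup and the mirror run can still drift, which is why this only "comes close to" a true option 3.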
(06 Dec '14, 06:00)
Volker Barth
Most of our customers have dedicated database servers with no client access (at a file level) to the server anyway - so it isn't normally a problem. I like the idea of extending the backup process. However, I think all our implementations of Option 1 are on SANs and use some sort of built-in mirroring capability of the SAN to deal with the blob files.

While for most purposes I prefer Option 2 for its simplicity and reliability, the advantage of a much smaller database when it comes to shipping it about (e.g. for investigating a performance problem) can be nice. Detachable DBSPACEs would also allow that, though - a long-term wish-list item of mine.
(08 Dec '14, 06:18)
Justin Willey
Oh yes!
(08 Dec '14, 06:22)
Volker Barth
Option 2 (store the blobs in the database) is the best method, for all the reasons others have presented.

If you prefer to store your blobs on a separate drive for performance reasons, you can store the blobs in separate tables and assign those tables to a separate DBSPACE. That might help if you have certain processes that perform heavy I/O on the blobs... a separate drive keeps that I/O away from other data. Think carefully before creating multiple DBSPACE files, however; their existence makes administration more complex.

Another reason to store blobs in separate tables is MobiLink synchronization. MobiLink synchronizes whole rows, so if the blob columns don't change much, storing them in separate tables reduces the amount of unnecessary synchronization traffic.

answered 07 Dec '14, 12:34 Breck Carter

I'd think the one critical trade-off might be the time required for full database backup and restore, in case the blobs add up to many GB of data and the blobs themselves are rarely modified. Then even several dbspaces won't speed that up significantly, methinks.
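Breck's separate-DBSPACE suggestion uses standard SQL Anywhere DDL; a minimal sketch (the dbspace, file, and table names are invented examples):

```sql
-- Put blob storage in its own dbspace file on a separate drive,
-- so heavy blob I/O stays off the spindle holding the main data.
CREATE DBSPACE blobspace AS 'e:\data\blobs.db';

-- Blobs live in their own table, keyed to the main row, and the
-- table is placed in the new dbspace via the IN clause.
CREATE TABLE DocumentContent (
    doc_id   INTEGER PRIMARY KEY,
    content  LONG BINARY
) IN blobspace;
```

Splitting the blob column into its own table also gives the MobiLink benefit mentioned above: updates to the non-blob columns no longer drag the blob along in synchronization.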
(07 Dec '14, 12:58)
Volker Barth
I'm late to this thread. I don't have experience with option 3, but I am voting for the first option. The same is common practice in many answers on SO: http://stackoverflow.com/questions/8952/storing-a-file-in-a-database-as-opposed-to-the-file-system

Regarding file changes, moves, etc.: I prefer to have a file-system emulation on the DB side (e.g. folder/file names), but to store only references to the files. Usually people lose these references when they "play" with the file system directly - but that is a problem of the project's implementation, not of the practice itself.

Imagine that in the future someone would like to store large documents, videos, and songs in the DB. How would one scale such a solution?
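The "file-system emulation on the DB side" could look something like this - a hypothetical schema, not from the thread, shown only to make the reference-based approach concrete:

```sql
-- Option 1 sketch: the database holds only metadata and a path;
-- the bytes themselves stay on the file system or SAN.
CREATE TABLE FileReference (
    file_id     INTEGER PRIMARY KEY DEFAULT AUTOINCREMENT,
    folder_path VARCHAR(500) NOT NULL,  -- emulated folder structure
    file_name   VARCHAR(260) NOT NULL,
    file_size   BIGINT,
    sha256_hash BINARY(32)              -- lets you detect external
                                        -- modification or loss
);
```

Storing a size and hash alongside the path gives the application at least a chance to notice when someone has "played" with the files directly, which is the main failure mode Bill describes above.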