Is it possible to define articles in a publication and let the where clause refer to columns from a foreign table? For example I have these 2 tables: create or replace table employees ( empid int, withprovision int, empname varchar(10), departid int); I want to let sql remote replicate only the employees who make provision along with their orders, also I want to refer the subscription by from a foreign table. My scripts looks like this: create publication dba.mypub (table employees where withprovision = 1 subscribe by departid, table orders where exists (select 1 from employees where withprovision = 1 and orders.empid = employees.empid subscribe by ????); Table employees works exactly as I want (where clause and subscribe by do what I need). Table orders: the where clause acts as where 1 = 2 and I don't know how to write the subscribe by... Is this generally possible with SQL Remote? |
One approach is to use a join within your subscribe by (select e.departid from order_items oi inner join orders o on oi.order_id = o.id inner join employees e on o.empid = e.empid where e.withprovision = 1) In other words: You can use the same subscribe by condition for all dependent tables but have to join them to the table the subscribe by relates to. FWIW, this is also (and naturally in a better fashion) documented in the docs: |
A subscribe by subquery should do the trick here. CREATE TABLE cons.employees ( empid bigint NOT NULL DEFAULT GLOBAL AUTOINCREMENT(1000000), withprov integer NOT NULL, empname char(64) NOT NULL, deptid integer NOT NULL, PRIMARY KEY (empid) ); CREATE TABLE cons.orders ( orderid bigint NOT NULL DEFAULT GLOBAL AUTOINCREMENT(1000000), empid bigint NOT NULL references cons.employees, textcol char(64) NULL, PRIMARY KEY (orderid) ); CREATE PUBLICATION cons.p1 ( TABLE cons.employees where withprov=1 subscribe by deptid, TABLE cons.orders subscribe by ( select deptid ` from cons.employees where cons.employees.empid = cons.orders.empid and cons.employees.withprov = 1 ) ); With a subscribe by subquery, you now need to think about what happens when the parent table (employees in this case) is updated in such a way that will change what rows remote database will have in the child table (orders in this case). dbremote will take care of migrating rows on the employee table, because there is an operation in the transaction on that row, but there is no entry in the transaction log for the orders table. You need to write triggers with update publication statements that will insert or delete rows at remote sites on the orders table when you make changes to the withprov or deptid columns. CREATE TRIGGER bu_deptid_employees BEFORE UPDATE OF deptid ORDER 1 ON cons.employees REFERENCING OLD AS old_row NEW AS new_row FOR EACH ROW BEGIN UPDATE cons.orders PUBLICATION cons.p1 OLD SUBSCRIBE BY old_row.deptid NEW SUBSCRIBE BY new_row.deptid WHERE cons.orders.empid = new_row.empid AND new_row.deptid = 1; END; CREATE TRIGGER bu_withprov_employee BEFORE UPDATE OF withprov ORDER 2 ON cons.employees REFERENCING OLD AS old_row NEW AS new_row FOR EACH ROW BEGIN UPDATE cons.orders PUBLICATION cons.p1 OLD SUBSCRIBE BY NULL NEW SUBSCRIBE BY new_row.deptid WHERE cons.orders.empid = new_row.empid AND new_row.withprov = 1; UPDATE cons.orders PUBLICATION cons.p1 OLD SUBSCRIBE BY new_row.deptid NEW SUBSCRIBE BY NULL WHERE cons.orders.empid = new_row.empid AND new_row.withprov = 0; END; Disclaimer: These two triggers don't work well together when a single update changes both the withprov column and the deptid column. This should be a single trigger that ensures you don't insert or delete the same rows twice. I'll leave it as an exercise to the user to make a single trigger that does this properly. Reg |
Actually I have problem even with table EMPLOYEES.
This table is included in a publication in both sides (Cons. and Remote), which means the table replicates in both directions, the creation of publication looks like this:
The problem which I have is as follows (only in case of variant2).
I insert a new employee on rem., then dbremote replicates it to cons., and the next dbremote deletes the same inserted employee on rem.!! but, one moment this is what the log of dbremote says, however the delete statement on rem. will be actually rolled back so that at the end the inserted employee remains in both cons. and rem. (i.e. the end result is OK, but here many questions!!)
I do not understand why your cons has to different publications/variants for the same table.
How do you create the remotes?
When using the default tool DBXTRACT, publications on cons and remotes should be identical - with the exception of the SUBSCRIBE BY clause - which would be used to send updates from the cons to the remotes only for those rows satisfying the clause... (And you would use that SUBSCRIBE BY clause to "partition data" for remotes, not a WHERE clause. AFAIK, a WHERE clause is focussed on generally excluding/including rows for replication but not to direct whether rows are published to particular remotes or not.)
the mentioned two variants are not on the same time, they are just two different cases.
In my case I need the WHERE clause only on the cons., so the publication on rem. and cons. are not identical.
For the sake of simplicity I am posting here with the employees table as example.
Do you use DEFAULT GLOBAL AUTOINCREMENT (or GUIDs) for primary keys? Or could it happen there is a PK conflict when several remotes enter new data?
By default, SQL Remote doesn't send operations back to the origin database, unless there's an UPDATE conflict.
the PK in my case consists of 2 columns, one is sequential number (not auto increment but self calculated max+1) and the other is an identifier of the database.
Moreover, in my last test I was using only one rem., so there is no PK conflict.
When I embed my WHERE clause inside the SUBSCRIBE BY clause then I dont have the problem, something like this
I would very strongly suggest to use AUTOINCREMENTs here, unless you can absolutely make sure PKs are never re-used. PK re-use will usually happen when the maximum PK is deleted, so the next entry gets the same PK as the deleted row before... additionally, the max calculation also bears the risk of PK violations by two parallel select-max-and-insert transactions.
This sounds like it’s working as designed, with one exception that I’ll address later in my response (*).
First, let’s consider the following situation with the employee table, whose schema is exactly the same at the consolidated and remote, but whose publication definitions are slightly different.
SQL Remote definitions at Consolidated:
SQL Remote definitions at Remote:
Next, let’s insert two rows in the remote database.
When you run dbremote against rem1, both rows are sent to cons, since the publication at the remote simply says all rows on the employee table replicate. Dbremote runs on the consolidated, picks up the message from rem1, and happily inserts both rows. However, the row for employee ‘Reg’ has a value of withprov=0, and the publication definition in cons says that the rem1 remote database should only have rows where withprov=1, so it determines that rem1 should not have this row and sends a delete for the ‘Reg’ employee to rem1.
(*) The only thing I find odd about your post is that you say the delete on the employee table is rolled back when dbremote attempts to apply it at the remote database. I suspect this is because there is a foreign key relationship to the employee table at the remote, and there are child records associated with the row dbremote is trying to delete.
The key point here is that if a remote sends up a row to the consolidated and the consolidated determines via the where clause or subscribe by clause on the publication that the remote that sent the row should not have that row, a delete is sent back to the remote.
I’d be happy to be proven wrong, but I don’t believe this is related to how primary keys are defined at the consolidated and remote. Given the lack of information on why the delete may have been rolled back when sent down to the remote database, Volker and I both making educated guesses at what the problem might be.
Reg
I think Volker intended to say :
I would very strongly suggest to use GLOBAL AUTOINCREMENTs here, unless you can absolutely make sure PKs are never re-used.
While I certainly would very strongly suggest to use those GLOBAL AUTOINCREMENTs with SQL Remote, I had taken into account that Baron already seems to have a "database identifier", so I was only suggesting to alter the "sequential number" to be an always incrementing (non-global) AUTOINCREMENT at least, assuming changing from a composite two-field PK to a single PK would require to re-setup the complete SQL Remote setup...:
But I guess I'm not sure whether the (non-global) DEFAULT AUTOINCREMENT would work sufficiently when different values from different databases would be inserted - that might drive the next default increment way higher than necessary for the current database...
I'd missed that it was a two-column primary key. Sorry Volker.
No need to worry, I'm fine, I got the provision in your sample:)
Thank you Reg for the reply. Your assumption/explanation sounds very reasonable. The thing what you find odd is also clear to me, since my employees table has a child table (PK-FK relation), so that is not possible to delete the parent.
What I find odd: the phenomen of deleting rows on the rem. happens even if the where clause on cons. is 1=1.
I'll test it once more next week and write the results again her.
Thanks Volker, but there is definitely no PK conflict/reuse
Hello Reg, with your example my expression above ist not correct:
I don't know why in my example was deleted even with 1=1
Your example is correct, and I get the same response as you expected, the first employee will be deleted on rem. (as you said, it is according to the design), but I need the following:
Employees with withprov=1 should be replicated among all other remotes, but the employees with withprov = 0 may NOT be deleted on the publisher remote. In other words, each rem. should have all employees inserted locally(regardless whether withprov = 0 or withprov = 1), in addition to all employees withprov=1 (regardless whether inserted locally or on other remotes)!!
With something like this I could solve it:
A subscribe_by value of NULL or an empty string can be used (in confunction with the subscribe_by_remote option set to "On") to leave an entry at the remote it has been created at without publishing it to other remotes. This could do the trick here.
If all rows on the employee table in the consolidated database replicate to all remote database regardless of the value of the withprov column in the table, then IMHO, this column should not be referenced in any way in the publication, either in the where clause or the subscribe by clause. Just let the whole table replicate in both directions.
If the rule is that at the remote database you cannot delete a row in the employee table when withprov=0, then you should implement that in a before delete trigger, and throw an exception if you determine that there is a row in the SYSREMOTEUSER where consolidate='Y' (i.e. we are at a remote database).
It sounds like you are trying to solve an issue with publication design that should be solved using another manner.
Reg
I would need to see the entire article clause in the publication for the employee table, as well as the data that was sent down as a delete to answer this question.
In my (limited) understanding of Baron's description, it's different:
(Of course I do not know whether employees might change from the first to the second group or vice versa.)
Therefore I still do think this should be handled by publication design but the with an empty subscription value for the 2. group.
(Of course, in case those 2nd group employees should not be replicated to the cons at all, a publication with a WHERE clause like "keep_local = 0" would suffice if those employees are marked with "keep_local = 1".
No, I didnt say that all employees should replicate to all remotes.
Maybe I couldnt explain it well, but I try with other words.
The employees are inserted only on the remotes, and there are two types of employees (with and without provision).
All employees (with and without provision) should then be collected on the cons., later the cons. sends some employees (only those with provision) to all remotes.*
So at the end the cons. has all employees, each remote has its own employees in addition to employees with provision from other remotes*.
This is the first phase, as the second phase, even employees with provision should not replicate to all remotes *, as the remotes are grouped into different departments, so it is more preciesly so:
The cons. has all employees, each remote has its own employees in addition to employees with provision from other remotes within the same department
With your hints I could solve it, and my script looks something like this:
Hm, but then again according to Reg's explanation the cons would have to delete those employees with withprov = 0 on the remotes they were created on (unless a FK violation does prevent that there, but that would be bad design IMHO).
I guess a subscriby by clause that really expresses your (very reasonable!) business logic would look something like this - assuming the employees table has a column createdAt identifying the according remote, say being filled with DEFAULT CURRENT PUBLISHER:
The actual condition would relate how you can identify the according remote - the sample just might give you an idea.
Or, as suggested above, you would leave the deptid null for employees withprov = 0 because then they would be replicated to the cons but not to other remotes. Personally, I would prefer the logic to be expressed in the subscribe by, in my mind the "subcribe by NULL" behaviour is rather obscure...
logically yes, but it does not, not because of FK violation, but it does not try to delete them at all.
apparently it makes a difference whether the condition withprov=1 is written in the where clause or in the subscribe by clause.
Hm, I don't think so. (*)
According to your current SUBSCRIBE BY clause, I would think it will make a difference if there are any employees in a particular department with withprov = 1, it doesn't matter whether the current employee has withprov = 1 or not. If there is at least one such employee in the according department, all employees of that department will be replicated to all remotes, otherwise none - and in the latter case the cons would send a delete to the origin remote.
That's at least my understanding of that condition. Of course checking the transaction log with DBTRAN -sr will help to understand the distribution.
(*) I'm relating to the actual effect. I surely do agree that it's a different whether a condition is put in a WHERE or SUBSCRIBE BY clause of an article...
there was a misstyping in the above script, it should be like this:
sorry Volker, there was a misstyping in my script, I corrected it above.
Well, I didn't notice that typo (and assumed it the way you have now corrected to, i.e. with the join condition "with employees.deptid = departments.deptid").
Therefore my thoughts still apply that this condition probably does not work as you seem to expect.
So my scripts look exactly so:
At the end I have following results:
AND, there is no any delete statement in the log files of DBREMOTE on any of the remotes! --OK
Ah, I see, I finally understand that your subscribe_by clause really does what I was suggesting: As the condition returns no value for all rows with withprov=0, it's the case of a NULL or an empty subscribe_by value, and as such the subscribe_by_remote option will control what has to be done at the remote.
You might try to set that option to "Off" to see if then employees with withprov=0 are deleted at the remotes (unless a FK violation prevents that).
@Volker Barth
I think here the current row in employee will be evaluated.
I think the only way to enforce the delete of an employee on its origin DB is to insert an employee with an incorrect deptid (what in our case will never happen), for example: In this case this employee will be "moved" (cut+paste) from remote1 (deptid=1) to remote3(deptid=2)in my case this will not happen at all.
Anyway, when I delete 30001 on remote3 then it will be deleted on cons. too, nothing more -- this is also OK
But the condition was written intentionally because I want to prevent the cons. of replicating employees without provision.
In other words, in case withprov=0, then the employee should be maintained (and not deleted) in its origin DB and also should exist in cons.
@Volker Barth How can I find an explanation what does the mentioned option subscribe_by_remote?
These both statements make big difference, despite at the first look they look very similar.
The cons. ensures that each remote has only the following employees (withprov = 1 && belong to the same department), it takes also care to prevent remotes maintaining employees who withprov <> 1.
It deletes explicitely each employee on its origin DB if it violates the where clause (withprov <> 1).
The cons. sends to each remote the following employees (withprov = 1 and belong to the same department).The explicit delete of an employee on its origin DB happens only if the cons. detects a violation in subscribe by (if the employee belongs another department).
As stated somewhere in a comment above, see the link:
Also see the doc on that mentioned option.
In my understanding, if you would set the option to OFF, SQL Remote would delete those rows on the remote where the subscribe_by clause is empty, and that would apply to those with withprov=0. (And if this turns out to be true, I would recommend to document the significance of that option for your setup...)
Thanks for the pointer, that was my obvious misunderstanding here: I had thought that the SUBSCRIBE BY clause would be evaluated for the whole table as such, not for a particular row. But apparently I was wrong. (And gladly my own SQL Remote setups had not been dependent on that difference ;) )