Post by h***@gmail.com
In your reply to my question about that 220%, you mentioned points 2 and 3 as above, plus point 1 but with a 20% estimate in that reply (you later changed that to 10%). You never told me how those 20, 100, and 100 numbers had to be combined to arrive at your claimed 220% overall overhead.
Apologies for this assumption. And, forgive my ignorance, I have to ask now - *how* do we get an overall 220% overhead from 10% on the subtype tables, 100% on the subtype indexes, and 100% on the supertype indexes?
Post by d***@gmail.com
1. 10% additional overhead per Subtype (since it has only two columns)
2. 100% additional indexing overhead per Subtype (since it has only one index)
3. 100% additional indexing overhead (since it has only one index)
... (two [indices] minus one [index]) divided by one [index] equals 100% ...
100% plus 100% plus 10% equals 210%
If there were 9 subtypes that would be 900% *additional* overhead for the *additional* indices for the subtypes; plus 100% *additional* overhead for the *additional* index for the supertype; plus some *additional* overhead for the *redundant* column in each subtype.
I don't know how to be more specific, other than providing DDL again and counting the columns and the indices. Most IT people I know can figure it out from the above.
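For what it is worth, the counting can be written out as plain arithmetic (a sketch only, using the percentages claimed in this thread; nothing here is measured):

```python
# Plain-arithmetic sketch of the claimed *additional* overheads.
# The three percentages are the figures asserted in the posts above;
# this is just the addition being disputed, not a benchmark.

subtype_column_overhead = 10     # % per subtype: the redundant column
subtype_index_overhead = 100     # % per subtype: the second index
supertype_index_overhead = 100   # % on the supertype: the second index

one_subtype_total = (subtype_column_overhead
                     + subtype_index_overhead
                     + supertype_index_overhead)
print(one_subtype_total)         # 210

# Nine subtypes: nine additional subtype indices plus the one
# additional supertype index (the redundant columns are extra again).
nine_subtype_indices = 9 * subtype_index_overhead + supertype_index_overhead
print(nine_subtype_indices)      # 1000
```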
I have already pointed out that you are assuming the 220% is *overall* overhead, and that I meant that as *additional* overhead, not *overall* overhead. I don't know if you missed that or if you are hung up about the overall overhead (which you detailed, but which I consider irrelevant to the issue because it is platform specific; and the *additional* overhead is relevant since we are comparing the two constructs at the logical level).
Maybe go over the posts again, with those reminders in mind. Please do not rush.
Please try to focus on the issue and the argument, and not on one or the other platform. I did not discuss platform-specific performance (of either submission), only logical constructs. I responded to some of your platform-specific issues, and in doing so, I might have given you the impression that they are relevant. They are not relevant to the logical consideration of the issues raised in the thread. They are relevant to platform-specific implementation, only.
Post by h***@gmail.com
Since adding happened to result in just the right number, I assumed that you had done that, which, of course, would violate logic and mathematical rules.
No idea what you mean. Please re-state.
Post by h***@gmail.com
Post by d***@gmail.com
(In the genuine Relational world, we have implemented this Subtype from the outset. It is identified in the Relational Model, over 40 years ago, and it was fully catered for in the first published Standard for modelling Relational databases, 35 years ago. The difference is, until 2007 when Functions arrived, the constraints were implemented in the [required anyway] transactions. Since 2007, we can implement Functions, and implement Declarative CHECK Constraints for the Subtypes.)
And even before that, we could implement declarative CHECK constraints using subqueries, in products that implement the feature (SQL:2003, Feature F671 "Subqueries in CHECK constraints"). Or we could even directly declare the subtable hierarchy, again in products that implement the feature (SQL:2003, Feature S081 "Subtables"). These two options are far superior to both the method based on redundant columns as described by Joe in this topic and your function-based method.
Marvellous. Feel free to post a full submission.
In the meantime, in this thread, two submissions are being compared. Probably not a good idea to keep introducing obtuse issues. Better to stick to the thread.
Post by h***@gmail.com
Post by d***@gmail.com
2. Sure, the syntax is different. So what ? The syntax for every SQL statement is different, particularly in the area of (a) extensions and (b) features that each vendor implemented before the standard syntax was published. So what ?
So people who know that their code might be ported, or who for other reasons prefer to stay as close to the standard as possible, try to use ANSI features as much as possible, and non-standard features as little as possible. The method posted by Joe, that you criticize, uses CHECK and FOREIGN KEY constraints, directly portable across most (if not all) major RDBMS platforms. Your version requires much more work to port.
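For readers following along, the CHECK-plus-composite-FOREIGN-KEY method described here can be sketched on any platform that supports composite foreign keys; the following is an illustration on SQLite through Python, with invented table and column names (not the actual DDL posted in the thread):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
con.executescript("""
CREATE TABLE Media (
    media_id   INTEGER NOT NULL PRIMARY KEY,
    media_type CHAR(1) NOT NULL CHECK (media_type IN ('B', 'C')),
    UNIQUE (media_id, media_type)          -- the additional supertype index
);
CREATE TABLE Book (
    media_id   INTEGER NOT NULL PRIMARY KEY,
    media_type CHAR(1) NOT NULL CHECK (media_type = 'B'),  -- redundant column
    FOREIGN KEY (media_id, media_type) REFERENCES Media (media_id, media_type)
);
""")
con.execute("INSERT INTO Media VALUES (1, 'C')")  # supertype row says CD
try:
    # a Book subtype row for a CD supertype must be rejected
    con.execute("INSERT INTO Book VALUES (1, 'B')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The point of the sketch is only the portability claim: everything in it is plain CHECK and FOREIGN KEY syntax.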
You consider changing one constraint definition "work", do you ? Ok, fine.
Use Joe's method with the 110% *additional* overhead per subtype, and 100% *additional* overhead per supertype. And stop discussing logical performance or platform-specific performance, because if you do, you will be proving yourself to be hypocritical or self-contradictory. Choose portability over performance, or performance over portability, and stick to it.
But if you keep both considerations active and arguing for space, that will leave you in a state of constant activity, without resolution. Going backwards and forwards at the same time. Lots of power and steam, but no progress in any direction. No decision, no commitment to anything. It is fine with me, if you want to do that. But please, you don't need to tell me about it, and you don't need to post it on a public newsgroup.
I have commitments, I stand for something. Therefore making a decision is easy. I spend more time moving, than I do talking about moving, or deciding to move, or thinking about deciding to move, or talking about thinking about deciding to move.
Post by h***@gmail.com
That doesn't make it a bad solution. As I said before - it is a good solution, but it's not the be-all-end-all. There are situations where I might choose your solution over Joe's, but there are also many situations where I'd prefer Joe's.
Fine with me.
Post by h***@gmail.com
Post by d***@gmail.com
What is relevant, is that Functions (tables, datatypes, columns, everything in a database is User Defined, we do not have to keep telling ourselves that fact) are now Standard. And that you can use a function anywhere that you can use a column. So as long as the SQL platform does that (provide the standard requirement), you can implement standard Subtypes *declaratively* on that platform (and non-declaratively since whenever the platform was born). The syntax is irrelevant.
Before we go off on a debate on what is and what isn't declarative, maybe we should agree on the same definition?
The definition I have always heard and used so far is that declarative code only describes the intended effect without detailing the steps to achieve it, and that the alternative is procedural code that describes multiple steps that have to be taken in the specified order. By that definition, stored procedures, triggers, and functions (*) are all procedural, since they can have multiple statements in their body that have to be executed in order.
(*) One exception to this is a very specific kind of function that SQL Server supports, "inline table-valued function" - actually a misnomer, for it should have been called a parametrized view.
The reason why I care is that the nature of procedural code makes it much harder for the DBMS to fully understand what the result of the code is. And that, in turn, makes it extremely hard, maybe even impossible, to implement an *efficient* way to guarantee that, no matter what we do, the constraint is never violated. (The inefficient way is trivial: simply check all CHECK constraints for all rows in all tables after every transaction (deferred) or data modifying statement (immediate). This is, if I read it correctly, how SQL:2003 defines the effect of CHECK constraints.)
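The "trivial but inefficient" strategy just described can be sketched literally (a toy model in Python, with invented table and predicate names; real engines do nothing this naive):

```python
# Toy model: after every data change, re-evaluate every CHECK predicate
# against every row of every table -- the brute-force semantics.

tables = {"Media": []}                                   # rows are dicts
checks = {"Media": [lambda r: r["media_type"] in ("B", "C")]}

def modify(table, row):
    tables[table].append(row)
    # brute-force re-check: all constraints, all rows, all tables
    for t, rows in tables.items():
        for r in rows:
            for predicate in checks.get(t, []):
                if not predicate(r):
                    tables[table].remove(row)            # undo the change
                    raise ValueError(f"CHECK violated in {t}")

modify("Media", {"media_id": 1, "media_type": "B"})      # accepted
try:
    modify("Media", {"media_id": 2, "media_type": "X"})  # rejected
except ValueError as e:
    print(e)
```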
I hope all that is going to be relevant to the thread at some point.
I don't need long definitions or the considerations of procedural code, and I didn't realise that people are changing the meaning of words to such an extent that every time we have to have a technical conversation, we have to stop and check our definitions of technical terms. It is sickening what MS have done to the industry. I stick to the original definitions of the words, and I don't change anything. If anyone needs some other definition, the honest, professional thing to do, is to create a new word. Only dishonest, subversive, insane people seek to change the meaning of established words. But then, MS has been dishonest, subversive, insane since 1984.
• Declarative means declarative
••• as identified by E F Codd in the Relational Model
••• specified in SQL, as DDL
••• not procedural
Now you understand what I stated before. There is no need to worry about all the other possible definitions or non-definitions or options or considerations.
Post by h***@gmail.com
Post by d***@gmail.com
Er, what "performance overhead" ? Please specify exactly, or withdraw the statement.
Executing the function, which involves evaluating a query, for every row inserted in each subtable, and for every row updated in each subtable with a statement that has the possibility to update the primary key column.
The two ISO/IEC/ANSI SQL platforms that I specialise on, do nothing of the sort. As stated already, I have close to a hundred of these implemented. I have scores of performance tests and QP examinations, that give me confidence about it. I won't get into platform-specific performance issues.
My question was intended to remain in the logical performance area (you know, count the indices, think about what overhead the constraint and the function would have). Instead you tell me MS SQL performs like a dead bear. As already stated, and for reasons already given, I doubt it very much. But you are free to think that it performs this construct badly.
Post by h***@gmail.com
How much that will be, and how much more or less that will be than the performance overhead of Joe's version, depends on lots of factors. Chosen platform (and choices made by the engineering team of that vendor) probably being the biggest.
Good. So you understand that performance is a platform-specific issue. Note that I have stated that for the comparison, we need to evaluate the logical concerns, not the platform-specific concerns. This is c.d.t., not SQLServerCentral or the thousands of similar websites. The issue is subtype implementation, not how SQL Server of a particular version works. Try to maintain focus.
Post by h***@gmail.com
Post by d***@gmail.com
Post by h***@gmail.com
When inserting or updating a single row, that won't be a big problem. But what happens when inserting or updating thousands or even millions of rows in a single statement?
That is a very stupid thing to do, for any scenario, not only for Subtypes, because it will blow the transaction log. ANSI/IEC/ISO SQL is transactional. So whatever proscriptions apply to such attempts, apply to Subtype insertions as well, no more, no less. Each platform will handle that differently.
You think executing an INSERT ... SELECT is a stupid thing to do?
Or an UPDATE ... WHERE?
I didn't state that. Read again.
Post by h***@gmail.com
Those are statements that potentially affect thousands or even millions of rows. I personally prefer to use them over writing code to loop over rowsets one by one - a single statement is less lines of code, easier to understand, and (at least on SQL Server) performs orders of magnitude better - as long as you don't do any stupid things.
Evidently you are used to very small databases, very small commercial deployments, and single user applications. You probably don't need a RDBMS platform. The C: drive will do nicely. I don't know why you take so much interest in RDBMSs.
(no idea what a "rowset" is.)
In large corporations, with large databases, that are hosted on large servers, with thousands of online users, much as a fresh developer would like to use one statement to affect millions of rows, they are not allowed to do so. The auditors (we have them in large companies) will fry him in oil, and feed him to the hundreds of users who are baying because they are hung up on the locks, and the lock duration.
No, we have measures to prevent murder and aggravated assault.
They are called standards.
The Auditor will make a declaration such as "maximum transaction size is 1000" or "maximum locks per connection is 100" or both. If the fresh developer is so stupid that he can't figure out the four lines of code required to convert his one statement that affects multi-million rows (MS people love their big numbers, don't they) to run in a multi-user environment, we take him aside and teach him. The developers learn all the required standards before they submit code for promotion to dbo, and the DBA (policeman) knocks it back if it doesn't comply. There are levels of testing from Dev Test to UAT to Stress test on the Production environment, before the code gets to Production, so the auditors will catch whatever the DBA misses.
We don't hire Mentally Subversive types. They spend hours speculating and they get nothing done, but the worst part is, they try to invent their own "standards" from scratch, as if the universe began on their birthday.
It is up to people like me to write the standards. I did that 20 years ago, and they have not changed; they are platform-independent. There is a small platform-dependent component, but because the two vendors provide 100% backward and forward compatibility (I know, it is magic, but I don't have time to explain it), that too hasn't changed in 20 years. I supply both a commentary, with pictures and everything, and code templates. If a developer keeps breaching the standards (caught at promotion to dbo, anyway), he gets shown the door.
So there are perfectly simple ways (four lines of code, available in a template, for those who can't remember them), tried and tested, decades old, running in thousands of servers, that affect millions of row, executed in batches of 1000 rows, with no harm to performance, no impact on thousands of users, and no impact on the log file.
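The batching idea itself is platform-independent; a generic sketch (Python over SQLite here, with an invented table, and the 1000-row cap used as an example above):

```python
import sqlite3

BATCH_SIZE = 1000  # e.g. the auditor's "maximum transaction size is 1000"

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Archive (id INTEGER PRIMARY KEY, flag INTEGER)")
con.executemany("INSERT INTO Archive VALUES (?, 0)",
                [(i,) for i in range(10_000)])
con.commit()

# Instead of one statement touching every row (holding locks and filling
# the log for the whole run), repeat a size-limited statement until done.
total = 0
while True:
    cur = con.execute(
        "UPDATE Archive SET flag = 1 "
        "WHERE id IN (SELECT id FROM Archive WHERE flag = 0 LIMIT ?)",
        (BATCH_SIZE,))
    con.commit()               # one short transaction per batch
    total += cur.rowcount
    if cur.rowcount < BATCH_SIZE:
        break
print(total)  # 10000
```

Each batch commits on its own, so lock duration and log growth are bounded by the batch size rather than by the total row count.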
For database maintenance, such as changing the schema, and populating new tables from existing tables, by all means, skip the four lines of code, and execute the whole thing in a single transaction. But when you do that, you must take into consideration the platform-specific issues, out-of-scope for this thread, and the quality of the database implementation, otherwise it will fail, and you will have to keep repeating the steps, over and over, until you learn what those platform-specific and database-specific issues are. Then document them, and keep them in a safe place, because you will need it for the next maintenance job.
And, yes, that (not knowing your platform-specific limitations) would be a very stupid thing to do. You know, come in on a weekend to perform db maintenance; code and test the changes; set 32 parallel threads up; start them off; go for a long boozy lunch; come back in three hours; only to find that 29 threads crashed after 10 mins, and three threads are still rolling back. MS types think that that is "normal". We call it subnormal.
So yes, what you say is important, but there are simple ways around those issues, without resorting to the extremities that you identify. But the platform issues are more important. And they are irrelevant to this thread.
Post by h***@gmail.com
For SQL Server, using scalar user-defined functions qualifies as a stupid thing.
So don't use them!
Post by h***@gmail.com
I don't see the relevance of your remark about log space. If I need to insert millions of rows in a single transaction, I'll need the same log space whether I use a single INSERT ... SELECT statement or a million INSERT ... VALUES statements. (In fact, the former case would probably use LESS log space, depending on how the vendor implements the transaction log).
See above.
Well, my two vendors use about 1% of the log space, but still, there are issues re getting up to currency. But platform-specific issues and performance for new developers are all irrelevant to a logical discussion about subtypes.
Post by h***@gmail.com
Post by d***@gmail.com
Post by h***@gmail.com
This probably depends on the chosen DBMS. I know that SQL Server is unable to inline scalar user-defined functions into an execution plan, so for SQL Server, the only option is to execute a separate call to the function for each row (or at least for each unique set of input parameters, which in this case amounts to the same thing, since the primary key is included in the input parameters). So insert a thousand rows, and instead of converting the subquery that is hidden in the function to a join, SQL Server will have to perform a thousand executions of the subquery. Your performance will, excusez mon French, suck.
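The per-row execution claim is easy to observe experimentally on engines that expose scalar functions to the host language; here is a sketch using SQLite with a Python-registered function (an illustration only, and it says nothing about SQL Server's internals):

```python
import sqlite3

# Count how often the engine invokes a scalar function used in a query.
calls = {"n": 0}

def check_fn(x):
    calls["n"] += 1          # incremented on every engine invocation
    return x % 2

con = sqlite3.connect(":memory:")
con.create_function("check_fn", 1, check_fn)
con.execute("CREATE TABLE T (id INTEGER PRIMARY KEY)")
con.executemany("INSERT INTO T VALUES (?)", [(i,) for i in range(1000)])

rows = con.execute("SELECT check_fn(id) FROM T").fetchall()
print(len(rows), calls["n"])  # one function call per row
```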
Excuse me, no, *your* performance will suck.
*Your* as in: the performance of your code on SQL Server.
No, *your* as in, _your_ fixation about the low-level performance of _your_ code on the _only_ server you know. Qualified professionals have no such problems. Recall the context. You made a declaration about code performance, and you provided excruciating detail about the performance of one platform.
First, it is irrelevant to a logical comparison of subtypes (the thread).
Second, it is relevant to only one platform. (FYI, although not relevant to the thread, the two that I know well have no such problems; my code does not suck, it flies.)
Third, if God ever cursed me, the way so many others are cursed by HIM, and I had to implement Subtypes on MS SQL, I would spend some time and get to know what works well, and what does not, then I would implement Subtypes, using the best method, based on knowledge and actual testing. I would not be writing reams on c.d.t or SQL Central; I would not have to have long discussions; I would not be speculating and contemplating; I would just do it.
And whatever I do, it would be standard SQL (not pedantic or obsessed about syntax).
And whatever I do, it would *NOT* have one redundant column plus one additional index per subtype, plus one additional index per supertype. No, I couldn't justify that to anyone, unless they came from an asylum.
And I wouldn't write about it on c.d.t or SQL Central.
Unless some idiot posts a "solution" that uses twice the indices necessary and has redundant columns. Sure, then I would post. On second thought, no, I won't post, because if I do, I would have to spend hours (days?) explaining it to MS people who don't understand the simplest things about their job and who are obsessed about things that do not matter.
Post by h***@gmail.com
Post by d***@gmail.com
My performance (DB2 and Sybase) flies. I already have close to one hundred standard Subtype structures in production. Millions of transactional inserts have been tested. Probably a thousand full table unload/reloads (thousands to millions of rows each) have been tested.
Great! Have you done a performance comparison test of your method versus "the Joe construction"?
1. "Construction" ?!?!? Are you crazy ? No, don't answer that.
The English word (unless you have a private definition) has an established meaning. Construction Engineers, even bricklayers, would not do that. Construct means build, not destroy, not build using twice the resources. Part of the reason is, they get paid. The customer is not going to approve a house with an extra bedroom, or a shelving unit with an extra cabinet. No. The brickie or the construction engineer cannot deliver it free, and the customer is not going to pay for something they did not order. We have laws in some western countries for this purpose.
Celko gives his cancer away free. MS types suck it up, and implement monstrosities "free"; they get paid the same if their code runs twice as slow, or if it needs a server twice as powerful. There is no pride. But they will print off reams of paper from SQL Central, "proving" that their "construction" is widely used (it is, like cancer), and that it is "standard" (cancer is standard in the MD world, they invent a new one every time they find a cure for the old one). They never read c.d.t.
I construct things, in such a way, that they never need to call me back to fix it. MS types need their future job security, so they are quite happy to place little bombs and cancers everywhere.
2. Test
No, I have not done a test on Joe's cancerous method on my platform. That would be very stupid. Given all the benchmarks that I have done, that include Functions, and indices, and duplicate/redundant indices, I already have a good idea of the logical cost. My memory has not failed. I do not take drugs. I do not need to test the same thing, from scratch, every time, to find out, gee whiz, it produces the same results.
I know what the result of one redundant index per subtype plus one redundant index per supertype minus one function is.
Perhaps that is the difference between people who were qualified in universities 35 years ago, and those who get qualified now. Perhaps that is the difference between "logical" and stupid (those who have to test the physical, over and over, from scratch, because they have no logic).
Post by h***@gmail.com
Post by d***@gmail.com
When you get into contemplating that level of "performance" problem, it becomes very platform-dependent, and dependent on the quality of code, for both transactional inserts and streamed inserts. We are better off if we limit this thread to logical issues and measurements, and keep away from physical implementations of those logical constructs, which are heavily dependent on platforms, their specific limitations, etc.
The same can be said for storage overhead. Unique constraints don't have to be implemented by adding an index. Foreign keys don't have to be implemented by adding a column (some vendors use pointer chains instead). Etc.
So you haven't heard about "A Relational Model of Data for Large Shared Data Banks" by a gentleman named Dr E F Codd. It is a worthwhile paper to read. It might cut your work in half. It might eliminate your concerns about irrelevant matter. No, forget it, you are too obsessed with irrelevant matter.
Post by h***@gmail.com
Also, it is quite surprising to see someone who said "you are both fools" to Joe Celko and Roy Hann because they advocate a subtype/supertype solution that, allegedly, has 220% overhead now saying that we should keep away from physical implementations.
Not worth responding to. That is your tail that you are chewing on. If you keep eating in that direction, you will swallow ... never mind.
Post by h***@gmail.com
If we look at the logical side only, not caring about implementation details such as storage or performance, both solutions are good. They both achieve what they set out to achieve - ensure that there are no integrity violations in a subtype/supertype relationship.
That's all ?
You call that "logic" ?
Must be a private definition.
See my response [2] re Test.
Post by h***@gmail.com
If we look at the implementation effect as well, we should look at all effects. In this case, the effects are on storage and on performance.
Well, you could look at storage and performance logically, without getting into platform-specifics. But that does not appear to be in your private definition of "logical".
Let me guess, you are an all-or-nothing kinda guy. Black-or-white. Consider either no "effects" at all, or consider every possible "effect": logical, physical, platform-specific. And heh heh, you don't charge for it, it is "free". I am qualified in IT, I am not qualified in psychiatry; sorry, I cannot help you.
Post by h***@gmail.com
Post by d***@gmail.com
Actually, I have people who use my Subtype structure, and who have implemented it on MS SQL without any problems. So I can say for sure that (a) they did so without jumping through the hoops that you are speculating about, and (b) they suffer no performance problems or "performance problems", but I do not know what MS SQL does internally in each version, to agree with or refute your speculation about it. Therefore, I will state simply, I seriously doubt that that is the case for MS SQL.
I'll try to find the time to do a simple test. If I find the time, I'll post repro code and test results in a later message.
If it takes more than 15 mins total development time, or one minute execution time, to load 16 million rows with the DDL I have already posted, Product plus two subtypes, please note, _you_are_doing_something_wrong_.
Come back here and post, or email me direct, and I will help you.
Last time I ran a very similar benchmark, for an Oracle guru, 2 x 16 million rows inserted took 13.2 secs, and that was on a tiny demo box (laptop), I did not have access to my dev box at the time. Because Oracle carks it with subqueries, the benchmark on that side never finished; the modified version that used derived tables took 30 something mins. Given your statements re row-processing to avoid function calls per row, I won't tell you what to do, just figure out whatever is the best for your particular version of your particular platform, and do it.
Post by h***@gmail.com
Post by d***@gmail.com
Post by h***@gmail.com
Another possible problem you have to be aware of with your solution is that, because the subquery is hidden in the UDF, it is unknown to the DRI mechanisms. Some scenarios will allow violations of the CHECK constraints, because the RDBMS does not know it has to re-check.
Excuse me, but it *IS* a DRI CHECK Constraint. Please read the code again.
"Some scenarios" is not correct, no, scenarios implies the general case, not platform specific. What you are talking about is how a particular platform operates, not scenarios. So the honest statement is, "some platforms will allow violations of the CHECK Constraint".
I don't have to read the code again. I know what it does.
Well, then, you should know that it is NOT an "unknown DRI mechanism" as far as the SQL Standard is concerned. Please be informed that it is a normal, visible, non-secret SQL DRI Constraint. The subquery is not hidden or "hidden" either.
You may know what it does on your platform, and you will not listen to what it does on other platforms, but you make statements anyway.
Post by h***@gmail.com
Will the sequence of two INSERT statements and one UPDATE statement cause a constraint violation on any of the platforms you use? On SQL Server, it won't. The UPDATE will be accepted, causing the data in my table to violate the subtype/supertype rules. The corresponding sequence of two INSERT statements and one UPDATE statement against Joe's implementation *will* cause an error to be raised.
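The sequence being described, run against a sketch of the redundant-column implementation (SQLite via Python, invented names, not the DDL posted in the thread; the composite FOREIGN KEY is what makes the supertype UPDATE fail):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE Media (
    media_id   INTEGER NOT NULL PRIMARY KEY,
    media_type CHAR(1) NOT NULL CHECK (media_type IN ('B', 'C')),
    UNIQUE (media_id, media_type)
);
CREATE TABLE Book (
    media_id   INTEGER NOT NULL PRIMARY KEY,
    media_type CHAR(1) NOT NULL CHECK (media_type = 'B'),
    FOREIGN KEY (media_id, media_type) REFERENCES Media (media_id, media_type)
);
""")
con.execute("INSERT INTO Media VALUES (1, 'B')")  # first INSERT
con.execute("INSERT INTO Book  VALUES (1, 'B')")  # second INSERT
try:
    # the UPDATE that flips the supertype to 'C' while a Book row exists
    con.execute("UPDATE Media SET media_type = 'C' WHERE media_id = 1")
except sqlite3.IntegrityError as e:
    print("update rejected:", e)
```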
I have already explained at length, how my two platforms handle it. Read the posts again, 'cause I am not going to type it again.
Post by h***@gmail.com
This can be fixed - I would need to add a CHECK constraint on the Type column in the supertype table that checks that the corresponding primary key value does not exist in any of the subtype tables that are mutually exclusive with the one indicated by the type value. For a supertype/subtype relationship with (n) mutually exclusive types, that requires (n-1) subqueries.
I have already warned against circular references. But you appear to insist on it. That is your insane "logic", nothing to do with me.
Post by h***@gmail.com
Post by d***@gmail.com
Yeah, sure. It appears that you do not understand context.
1. Joe Celko posted a generic framework that can be used to protect integrity in a subtype/supertype relationship;
2. You called him and Roy Hann "fools" for proposing and endorsing that framework;
Half-truths are lies. I gave an explanation, and pointed out the excess baggage.
Post by h***@gmail.com
3. You then posted your, supposedly better, alternative; and
No, no, no ! I didn't !
It is better, or less overhead, or standard, only to logical, IT-qualified people.
Not for the rest. Your statement implies better overall, for everyone, and that is simply not true. I have evidence on c.d.t, of a couple of people who are absent logic, who have private definitions of "logic", that proves it is not "better" for everyone.
Post by h***@gmail.com
4. I assumed (yup, did it again!) that this alternative was designed to meet the same goals as the original: being a generic framework to protect integrity in a subtype/supertype relationship.
I cannot see how it was an assumption; I declared that in writing.
Post by h***@gmail.com
If the code you posted is *not* intended to be a generic method for subtype/supertype relationships, or if protecting integrity is *not* what the code sets out to do, please say so now - that would save both you and me a lot of time and effort!
As stated several times, but I will repeat for your convenience, since it is still unclear to you, my posted solution is:
••• Standard (Relational, if you can read the Codd paper)
••• Standard (SQL)
•••••• excludes non-SQLs
•••••• you need to figure out your platform-specific issues and limitations
••• generic
••• protects both data and referential integrity (I did give exact methods, for logical people)
Post by h***@gmail.com
Post by d***@gmail.com
1. The example code provides the implementation details for standard Subtypes. There is no suggestion that it does anything more than that. It is not a full blown application.
Ah, there's part of the answer already.
Post by d***@gmail.com
2. The Supertype-Subtype structure is supposed to be considered as a single unit, not as a bunch of tables with disparate and singular relationships between them. You should not be updating anything in the cluster without considering the whole cluster, transactionally.
Agreed that I shouldn't. Just as I should not delete a "parent" (hate those terms!) that still has a "child" in a foreign key relationship. But if I do so anyway, I trust the FOREIGN KEY constraint to slap me and prevent the change. Similarly, I expect whatever construct I use to guard the supertype/subtype relationship to slap me and prevent the update if I do something that would break integrity.
Yes. Weird. You do appear to understand some things, but strangely, not others.
Post by h***@gmail.com
The code I posted proves that there is at least one case on at least one platform where it doesn't.
No, it doesn't. Read [1] and [2] again, slowly.
Post by h***@gmail.com
Post by d***@gmail.com
3. When writing an application, you are best advised to implement all inserts/updates/deletes as transactions; you should never write singular inserts/updates/deletes.
When writing singular inserts/updates/deletes, a transaction is implicitly started. SQL Server (and probably other vendors too) support two options on when these are committed - implicitly at the end of the statement (if no constraint was violated), or only after an explicit COMMIT statement.
Too silly to respond to. And if I did, I would have to provide an entire tutorial on SQL transactions. There are many courses on the subject.
Post by h***@gmail.com
But I fail to see the point. Constraints (and I consider both your and Joe's code as a way to work around a limitation of a constraint, so I expect the same behaviour) should always work. Even when application code does not follow best practices. (In fact, in that case constraints are even MORE important!)
Oooooooookay.
Post by h***@gmail.com
Post by d***@gmail.com
5. The example code, being an example of Subtype implementation, uses singular inserts/updates/deletes, rather than transactions, as is commonly accepted in posted examples.
So why slap _me_ for using singular inserts/updates/deletes in my posted examples?
I didn't slap you, I informed you that one had a transactional context (there's that word you do not understand again) and the other did not. And it is for you (assuming you can write the required code) to apply the transactional context when required, especially after it has been pointed out to you.
Post by h***@gmail.com
Post by d***@gmail.com
11. If you have trouble writing the Declarative Constraints and transaction code (no offence, but this structure is new to you, and you do not understand it yet) that is required for ordinary Data and Referential integrity, which you are implying is part of the Subtype integrity, at your request, I would be happy to supply it. Using Standard SQL, and you will have to make the minor syntactical changes required for your platform. I won't be posting transaction basics or tutorials; I will assume the reader knows that, and post just the code.
This is now the fourth time in a single post that you hint at my incompetence. You are free to believe anything you want of me, but saying this out loud and then repeating it multiple times does not really add anything to the credibility of your arguments.
Er, I have not been "arguing" re my submission. I have not stated anything new since my submission re Subtypes.
All the voluminous posts since then have been responses to your voluminous questions and discourses, which have little to do with the submission. That record demonstrates how well you can learn something new and genuinely apply it on your platform without failing, and yes, it demonstrates your incompetence. Any "argument" in that content is about your issues.
Post by h***@gmail.com
Post by d***@gmail.com
Post by h***@gmail.com
Bottom line: I like your solution. But it's not 100% safe, and though you avoid the storage overhead, you pay for this with a performance overhead. I'd recommend people to choose between the two solutions on a case by case basis.
I refute the "not 100% safe" on the basis described above: implement whatever safety you need; that is not shown in, and cannot reasonably be expected to be shown in, the example of Subtypes.
Your code as posted does not provide 100% safety against integrity problems in the supertype/subtype relationship on at least one platform. The code I provided will result in a supertype indicating a CD, with a row in the table for the Book subtype.
I cannot help you if you cannot read the posts and understand them.
Post by h***@gmail.com
Before starting this reply, I spent a lot of time trying to decipher the relevant sections of the SQL:2003 standard (not the latest, I know, but I have to make do with what I have). If I didn't misunderstand anything, your code (or at least the ANSI-standard equivalent code) will provide 100% safety in a fully compliant RDBMS, because (again, if I understand completely) the standard says that the database should re-check all CHECK constraints for all rows in all tables after any modification - or at least behave as if it does.
Name-dropping won't help you.
I have already explained in detail: we have no deferred constraint checking, we do not need it, and we are not so stupid as to write circular references.
Post by h***@gmail.com
I have no experience with other mainstream RDBMS products, but I would be very surprised if it turns out that all other vendors do manage to pull that off with CHECK constraints that reference other tables through a user-defined function. (But I am not excluding the possibility - I have been surprised before.)
Evidently you cannot read. If you do not trust me, or the submitted code, just say so and stop this voluminous nonsense.
It is not reasonable to disbelieve what you (admittedly) do not know, but then hey, you are not logical either.
Maybe phone someone and check, or post a question on the Sybase newsgroup. (let me know, I would like to make sure that you have not screwed up the wording and sabotaged yourself).
Post by h***@gmail.com
For those vendors that, like SQL Server, only check a CHECK constraint for inserted and modified rows, an extra CHECK constraint and an extra function would be required. I sketched a rough outline of this extra function above, but I'll be happy to provide code if you want me to.
The "not safe" refutation stands. Your concept of "not safe" is limited to the only platform you know, and to an inability to port my standard code to your platform.
[3] No, I do not need your code. Just do not use my code on your platform.
Others who use your platform, and who do not have your level of skill, can use my code on your platform. They will not require 20 pages of explanation from me. And they will be able to get around any "advice" that you give them.
Post by h***@gmail.com
Post by d***@gmail.com
I refute the "performance overhead", as detailed above, but I cannot be specific because you have not identified specifically what the alleged "performance overhead" is.
I replied to this above.
No, you did not, in the logical context. Yes, you did go on for days about irrelevant platform-specific issues which you could not figure out for yourself (and which I cannot help you with). That is limited capability, or incompetence, or whatever you call it. That is not "performance overhead" logically. It is performance overhead for a person who does not understand the required construct, and who does not understand his platform well enough to implement it without making a disaster of it.
The disaster is yours, it has nothing to do with the logical construct submitted.
Post by h***@gmail.com
Post by d***@gmail.com
a. you understand it, which clearly, has not been achieved yet
And there's number five. :)
Yes. Supported by volumes of evidence produced by your own hand (ie. excluding my statements). We've entered double-digits in this post.
Post by h***@gmail.com
Post by d***@gmail.com
b. you have actually implemented it and tested it (both transactional inserts/updates/deletes, and singular streams inserts, etc), and you have gained actual experience, and some basis for confidence
As I said above - if I find the time, I'll run some tests on SQL Server 2012 and post back with the results when finished.
Whenever you have fifteen minutes.
Post by h***@gmail.com
Post by d***@gmail.com
Until then, your recommendations are based on contemplation, not facts, not experience.
And until you run an actual side-by-side comparison test of your version against Joe's version, the exact same can be said about your recommendations.
I did not recommend anything. I submitted a solution for qualified IT people. They can figure out the logical cost. And if they are incompetent, they will have to get into the physical, and then they will figure out the physical costs. But if they are stupid, they will implement a disaster of a test, and then form a conclusion that it does not work. Then they will post on a blog somewhere on SQL Team or SQL Central that it doesn't work.
So for you, I am telling you straight, do not use my solution, use Joe's. Do not waste any time "testing" it, no matter what you do, it will fail.
Post by h***@gmail.com
(BTW, the only recommendation I actually made so far is to "choose between the two solutions on a case by case basis")
I did read that, yes. That was the recommendation that I recommended against. It is based on incompetence and fear of the unknown, not on knowledge.
Post by h***@gmail.com
A quick reply (I really need to get going on some other stuff, but I can't let this slip) to some issues.
None of your replies are quick, you can stop lying to yourself.
Post by h***@gmail.com
Post by d***@gmail.com
We can implement Declarative Constraints to ensure that a child cannot exist without the appropriate parent (CONSTRAINT Child.FK REFERENCES Parent.PK), but we cannot do so to ensure that at least one child exists for each parent. How do you implement that ? In Standard SQL, please, not worrying about syntactical differences.
In standard SQL, this *is* in fact possible with declarative constraints. On the OrderItems table, add a constraint CHECK (EXISTS (SELECT * FROM Orders WHERE Orders.OrderID = OrderItems.OrderID)) [actual syntax may be slightly different; I never worked on a product that supports subqueries in a CHECK constraint so I don't know the exact syntax for implementing the correlation]. Then make this constraint deferred (unless you want a very complicated way of making the two tables read-only).
In a real database, this only works if both subqueries in CHECK constraints and deferrable constraints (feature F271 in SQL:2003) are supported.
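As a sketch only: under a fully conformant SQL:2003 implementation (subqueries in CHECK constraints, plus deferrable constraints, feature F271), the at-least-one-child rule could be declared on the parent side roughly as follows. The syntax and names are assumptions, and no product discussed in this thread supports the combination.

```sql
-- Hypothetical Standard-SQL sketch; not runnable on SQL Server or Sybase.
ALTER TABLE Orders ADD CONSTRAINT Orders_HasItem_ck
    CHECK (EXISTS (SELECT *
                     FROM OrderItems
                    WHERE OrderItems.OrderID = Orders.OrderID))
    DEFERRABLE INITIALLY DEFERRED;

-- With the check deferred, a transaction can insert the parent and then its
-- first child; the constraint is evaluated only at COMMIT:
-- BEGIN;
--   INSERT INTO Orders     (OrderID)         VALUES (1);
--   INSERT INTO OrderItems (OrderID, ItemNo) VALUES (1, 1);
-- COMMIT;
```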
That is a contradiction, again, like so many others from you. Contradiction inside the one cranium is insanity. Which you appear to be quite sanguine with.
Real databases do not support deferred constraint checking. Real databases support standard Subtypes, with full Data & Referential Integrity (in both directions, yes!) without deferred constraint checking. We don't have the hilarious CASCADE either. Amazing. We don't read the standard and argue about it either; we just do it.
Post by h***@gmail.com
Since the databases I know don't implement those features, I agree that this business rule can not be implemented as a declared constraint.
Okay.
Post by h***@gmail.com
Post by d***@gmail.com
It is worth noting that the seriously deranged will implement an FK in the parent, that references the child, which results in a circular reference, and then insist that the platform must support "deferred constraint checking". These people have no sense of Rules, or Order of Rules. But those of us who don't have that form of insanity in the first place, don't have the demands in the second place. DB2 and Sybase do not have "deferred constraint checking"; we have enjoyed secured databases with full data and referential integrity for 35 years without it. Typically the platforms that provided "deferred constraint checking" did not have transactions, and were not SQL anyway. Some of them have implemented a weird form of non-ACID non-SQL "transactions" since then. Therefore as stated, SQL only, and no weirdness with non-SQLs that provide asylum facilities.
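For readers unfamiliar with the term, the circular reference described above looks roughly like this (names invented for illustration): each table's foreign key references the other, so neither row can be inserted first unless constraint checking is deferred.

```sql
CREATE TABLE Parent (
    ParentId     INT NOT NULL PRIMARY KEY,
    FirstChildId INT NOT NULL          -- will reference Child: the circularity
);
CREATE TABLE Child (
    ChildId  INT NOT NULL PRIMARY KEY,
    ParentId INT NOT NULL REFERENCES Parent (ParentId)
);
-- Added after both tables exist, because Child did not exist yet above:
ALTER TABLE Parent ADD CONSTRAINT Parent_FirstChild_fk
    FOREIGN KEY (FirstChildId) REFERENCES Child (ChildId);
```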
A foreign key instead of a check constraint would even be better, indeed. So, does this make me "fully deranged"? And if so - why?
Er, the foreign key that you suggest *IS* the circular reference.
1. Look the term up, and understand it.
2. Read the above paragraph again, slowly, with the knowledge that the foreign key that you suggest *IS* the foreign key that I identify in the first sentence, and that it *IS* the circular reference that I identify in the next clause.
If you still don't understand, then you need formal education, which I do not have time to provide.
Further, with your demonstrated technical ability, you would require at least twice the normal time.
Further, as stated already, your demonstrated mental condition is not something that I am qualified to address, and overcome, which is a pre-requisite to education.
Post by h***@gmail.com
If you happen to be working on a platform that does support this feature set, using the tools available to you would, in my opinion, be the smart thing to do.
Ooooookay. But that opinion has the same credibility as all your opinions on this page.
Post by h***@gmail.com
(You'd still need to provide the add and drop stored procedures, and still need to use DCL to secure the database, but if you can use a declared constraint to express a business constraint, it would be folly not to do it.)
Ooookay.
Post by h***@gmail.com
If your "fully deranged" refers to people trying to implement this on a database that doesn't support deferred constraints and then blaming the vendor - well, in that case I agree with the sentiment you expressed, although I would have used different words.
Ooookay.
Post by h***@gmail.com
Post by d***@gmail.com
It is the same for a Subtype cluster, nothing more, nothing less.
I sense a misunderstanding.
I am fully aware that your function-based framework does not force people to insert a row in the ProductBook table for a product of type 'B', or to insert a row in the ProductCD table for a product of type 'C'.
Er, it does force exactly that !
You say you can read and understand code, but the evidence is, you cannot! Maybe you can read and understand isolated code snippets. But you definitely cannot understand the effect, the overall total logic, of all the posted code snippets working together in an integrated "system".
There is no point in me asking you to "read again". You have already proved to be non-logical.
Post by h***@gmail.com
Neither does Joe's framework. (And I actually believe that to be a good thing - the business requirements could easily allow us to know about books or CDs for which the number of pages/tracks is unknown).
You are getting distracted and tangential again. One of Joe's tricks.
Post by h***@gmail.com
The code I posted before does indeed end up with product of type 'C' with no row in ProductCD, but that is not the problem I was trying to demonstrate.
No, it does not. You do not understand transactions. Read further.
The most accurate thing you could say is: "The code that you [Derek] posted, implemented on my platform, given my abject understanding of the construct and my complete and total lack of knowledge about Standard Transactions, would not prevent breaches of data & referential integrity." Which really means: "I [Hugo] do not understand Data & Referential Integrity well enough to implement it correctly or consistently; when given some example code, I will stuff it up; I need the entire application coded and given to me."
Post by h***@gmail.com
The real issue with the code I posted is that, after running it, I have a product of type 'C', that still has a row in ProductBook. This is not allowed.
I agree it is not allowed. But the simple evidenced fact is, you have allowed it. Further proof of my point immediately above.
Post by h***@gmail.com
The framework does prevent this if the usual order of actions is followed (insert supertype; insert wrong subtype). But a different order (insert supertype; insert correct subtype; update supertype) is not prevented.
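The ordering Hugo describes can be written out as follows. The table and column names approximate the Product/ProductBook example used in the thread; they are assumptions, not the submitted DDL.

```sql
INSERT INTO Product     (ProductId, ProductType) VALUES (42, 'B'); -- a Book
INSERT INTO ProductBook (ProductId, NumPages)    VALUES (42, 350); -- passes the CHECK
UPDATE Product SET ProductType = 'C' WHERE ProductId = 42;         -- now a CD...
-- ...while the ProductBook row for ProductId 42 remains: the alleged breach,
-- unless a constraint or transactional access module also guards the UPDATE path.
```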
Barking up the wrong tree.
Maybe THREE circular references, that should do it.
Post by h***@gmail.com
Yes, I understand that I can code my way around the limitations. I can also write stored procs to ensure that a normal foreign key relationship is never violated. But I still declare the foreign key relationship.
Yes, that is the Standard practice. It has two separate and distinct purposes (scary for an all-or-nothing person!). But those who are not deranged state it in the reverse Order. Note the capital "O".
Post by h***@gmail.com
To me (and your opinion may differ), it is an either/or, with no in-between. Either I accept that my database has no out-of-the-box declarative method to ensure that the integrity of my supertype/subtype relationship is never violated;
Presumably you mean, via code.
In that case, yes, that is exactly what I advised. That would leave you with no extra indices, no redundant columns.
Post by h***@gmail.com
I accept this shortcoming, write access modules that do the work instead and be done with it.
Yes! Assuming "access modules" means stored procs with transaction control (no autocommit baby puke).
Post by h***@gmail.com
Or I find a creative way to use whatever tools I have available to still get the RDBMS to do that work. (And I'd still write the access modules, as a convenience for the app developers - but I also know that whatever anyone does, my data's integrity is covered.)
No, no ! Don't do that. Given your long discourse, you will probably prevent the access modules from working.
Post by h***@gmail.com
Joe's solution is the latter category. Yours is in-between - it uses functions and check constraints to prevent violation in some -admittedly the most common- cases, but leaves a back door open so that I still *need* those access modules, and *need* to ensure that they are 100% reliable in all cases.
Oh, really ?
That's a *neeed* that everyone has, in every cluster of tables in which you *neeed* to ensure data & referential integrity, because SQL declarations alone do not provide 100% data & referential integrity, and the "back door" is always open, all the time, for every pair of related tables. And if you don't provide the access modules for *all* the tables that *neeed* them, you will have spinsters and orphans, yes. And father-less single mothers.
It is no more *neeedy* or less *neeedy* than any other pair of *neeedy* tables. The "back door" is the same "back door" for every table.
Disgusting.
Post by h***@gmail.com
Post by d***@gmail.com
Post by h***@gmail.com
... So insert a thousand rows, and instead of converting the subquery that is hidden in the function to a join, SQL Server will have to perform thousand executions of the subquery.
I am not sure that I understand the specific issue you are concerned about, so forgive me if I am wrong.
There's actually a combination of two issues. Both specific to SQL Server.
Finally, after 22 pages, an admission of what I said in the first place. Hallelujah! God is Merciful! Praise be to Allah!
Now please, we do not need to hear about what one idiotic, broken platform does or does not do with regard to performance.
Post by h***@gmail.com
1. < portions irrelevant to a logical evaluation of the subject <snipped> >
2. < portions irrelevant to a logical evaluation of the subject <snipped> >
Add 1 and 2 together, and ... < portions irrelevant to a logical evaluation of the subject <snipped> >
I repeat, for you, given your demonstrated competence on your platform *do *not* implement my Subtype submission*
Post by h***@gmail.com
Post by d***@gmail.com
But please don't take my word for it, you can always try it. Use my DDL, so as to save time, load up a Product table with 16 million rows (a thousand is nothing these days). It takes only 5 to 10 mins. In case it needs to be said, please do not use a cursor: not only is it suh-low, it will skew the result badly. I don't have anything MS or MD in my stable, otherwise I would offer.
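One hedged way to do such a set-based load (no cursor) is a cross join over a small digits table. The names and the MOD syntax here are illustrative and may need minor changes per platform.

```sql
-- Digits is an assumed ten-row helper table holding 0..9 in column d.
-- Seven cross joins yield 10^7 rows, the order of magnitude suggested above.
INSERT INTO Product (ProductId, ProductType)
SELECT n.Num,
       CASE WHEN MOD(n.Num, 2) = 0 THEN 'B' ELSE 'C' END
  FROM ( SELECT d1.d + 10*d2.d + 100*d3.d + 1000*d4.d + 10000*d5.d
                + 100000*d6.d + 1000000*d7.d + 1 AS Num
           FROM Digits d1, Digits d2, Digits d3, Digits d4,
                Digits d5, Digits d6, Digits d7 ) AS n;
```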
As promised in my previous reply - I will test when I have the time.
Nah. Forget it. Honestly. Given the evidence above, it will fail. Trust me. I did a uni-level psychology course once.
Post by h***@gmail.com
Post by d***@gmail.com
(I have heard that recent releases of MS SQL have an issue with Functions, depending on the content of course, but not with the QP, or with a QP which includes a subquery. I think it has to do with large amounts of code in Functions. Therefore we should be safe with a single-statement function.)
Not just recent releases. All versions that support user-defined functions (which off the top of my head is all versions since SQL Server 2005) suffer from this issue.
Maybe they will fix it in the next century. I can't wait.
Post by h***@gmail.com
And the number of statements in the function is not related. See for example this blog post, where I demonstrate the performance difference with a UDF to triple the input value: http://sqlblog.com/blogs/hugo_kornelis/archive/2012/05/20/t-sql-user-defined-functions-the-good-the-bad-and-the-ugly-part-1.aspx
Nah, it is not for me. The article takes far too long to explain simple things that logical people can understand in one sentence.
----
Just to be clear, I did not say you are stupid. I said you were incompetent; contradictory; arguing dishonestly and hypocritically; insane (deranged, particularly). You relate things that are not related, and you fail to relate things that are related. Clever in that insane devilish Einstein sort of way, clever enough to avoid being locked up, but not clever enough to avoid killing people or contracting AIDS or hanging yourself on a rope that you are setting up to hang someone else. Dahmer, Bundy, et al, stayed under the radar for decades, all their friends said they were highly intelligent. They wrote books.
Celko is stupid. And many here and elsewhere initially informed me of that, for which I am grateful, before it was proved to me in several instances. He knows when to shut up and leave the premises. You don't.
Cheers,
Derek