Vitor (Grand High Poobah)
Posted: Fri Oct 23, 2015 11:09 am
bdaoust wrote:
    They are going to create me my own broker to work with.
It's a rare triumph for common sense!
Don't forget to tell us how it goes.

bdaoust (Centurion)
Posted: Fri Oct 23, 2015 11:11 am
hahaha
I'll keep you all updated.
So just to be sure: is 1,200 elements in a DFDL schema fine from a best-practice perspective?

mqjeff (Grand Master)
Posted: Fri Oct 23, 2015 11:13 am
The best practice is to make sure your DFDL models match your messages.

bdaoust (Centurion)
Posted: Fri Oct 23, 2015 11:18 am
Well, I'm taking a large XML message, using a graphical map to do a For Each with nested if/then, and then mapping to an element within the DFDL model.
The requirement is that since they are going to be importing this into Excel, I need to maintain column integrity. So even if I only have data for 100 of the 600 data elements (the other 600 of the 1,200 are the header names), they want everything.
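In outline, the model being described looks something like this. It is a sketch only: element names are invented, and the complete block of default DFDL format properties that a real model needs is omitted (the product's CSV samples show one).
Code:
    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
      <xs:element name="CSVFile">
        <xs:complexType>
          <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="postfix">
            <!-- first sequence: the header row, one element per column -->
            <xs:element name="Header">
              <xs:complexType>
                <xs:sequence dfdl:separator=",">
                  <xs:element name="H001" type="xs:string"/>
                  <!-- ... H002 to H600, one per column ... -->
                </xs:sequence>
              </xs:complexType>
            </xs:element>
            <!-- second sequence: the data rows -->
            <xs:element name="Record" maxOccurs="unbounded">
              <xs:complexType>
                <xs:sequence dfdl:separator=",">
                  <xs:element name="F001" type="xs:string" minOccurs="0"/>
                  <!-- ... F002 to F600, one per column ... -->
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>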

mqjeff (Grand Master)
Posted: Fri Oct 23, 2015 11:51 am
So your output message needs 1,200 fields.

bdaoust (Centurion)
Posted: Fri Oct 23, 2015 11:57 am
Well, 600. I was referencing 1,200 because in the DFDL the first sequence is the header and the second is the actual data fields.
The header elements have their default values set to what I want as the column names.
Then I use a graphical map to map to the record sequence: I do a For Each (stepping through the XML) and check conditions, mapping to the record field when a criterion is met.
Because some of the maps will never happen due to a condition not being met, I had to create the header with default names set, to ensure I maintain a constant number of columns: their Excel workbook will explicitly reference columns and expect them to be there regardless of whether data is present.
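For example (column names invented), each header column is declared with a default value, so the serializer writes the column name on output even though the map never populates it:
Code:
    <!-- header columns: never set by the map, so the default value
         supplies the column name when the header row is written -->
    <xs:element name="H001" type="xs:string" default="AccountNumber"/>
    <xs:element name="H002" type="xs:string" default="CustomerName"/>
    <!-- ... one per column, up to H600 ... -->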

mqjeff (Grand Master)
Posted: Fri Oct 23, 2015 12:01 pm
Yeah.
You might think about whether there's some way to partition the model for different types of output - i.e., this output category only ever uses 20 columns.
Then you could make smaller models that had the header row as a fixed string, and the last part of each row as a fixed string of ",,,,,,"...
But if you don't know which fields are going to be used for any given message, then...

bdaoust (Centurion)
Posted: Fri Oct 23, 2015 12:19 pm
Yeah, the input and output will always remain the same. I even asked why they need a header. I mean, if it's always going to be the same, you would think the receiver could just keep their own header and I send them the data.
I noticed that if I took out the header sequence, deployment is definitely faster. I'm wondering if it's worth having two different DFDLs (a header and then the records), or just building a string in ESQL and attaching it to the final output. But that just seems like a bad thing to do.

mqjeff (Grand Master)
Posted: Fri Oct 23, 2015 12:28 pm
It might compile the individual DFDLs faster if they were separate. But you'd need a third DFDL that included both files, so you could build a message with a header and a body. That still might take as long to compile (on the deployment side).
It's worth a try, at least.
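Something along these lines, perhaps - file and element names are invented, and the two included files would each need to carry their own DFDL annotations:
Code:
    <!-- wrapper.xsd: the third DFDL, combining the two separate models -->
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
      <xs:include schemaLocation="header.xsd"/>
      <xs:include schemaLocation="records.xsd"/>
      <xs:element name="CSVMessage">
        <xs:complexType>
          <xs:sequence dfdl:separator="%NL;" dfdl:separatorPosition="postfix">
            <!-- Header is declared in header.xsd, Record in records.xsd -->
            <xs:element ref="Header"/>
            <xs:element ref="Record" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:schema>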

bdaoust (Centurion)
Posted: Fri Oct 23, 2015 12:31 pm
Hey, why not? I'll give it a whirl and report back.
Isn't a DFDL schema just XML behind the scenes? What happens behind the scenes during a deployment that would cause such a hang-up?

timber (Grand Master)
Posted: Fri Oct 23, 2015 12:35 pm
Quote:
    is 1,200 elements in a DFDL schema fine from a best-practice perspective
The number of elements is not the issue. The design of the model (both the XSD and the DFDL annotations) may well be part of the problem. Let me explain...
You could have 1,200 elements, with names 'element0001' to 'element1200', where each element contains *exactly* the same set of 10 child elements. If the XSD repeats the definition of those child elements under each element declaration, is this best practice? Of course not!
Similarly, you could have 1,200 elements, each with (almost) the same DFDL properties. If the DFDL schema repeats the same property values on every element, is that good practice? No. Is there a way of avoiding duplication in DFDL properties? Yes: format blocks, although the DFDL editor doesn't do a great job of advertising them.
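To make both points concrete, here is a minimal sketch (invented names, abbreviated property list): the shared DFDL properties live in one dfdl:defineFormat block that becomes the schema-wide default, and the repeated 10-child structure is declared once as a global type:
Code:
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
      <xs:annotation>
        <xs:appinfo source="http://www.ogf.org/dfdl/">
          <!-- declare the shared DFDL properties once... -->
          <dfdl:defineFormat name="textField">
            <dfdl:format representation="text" encoding="UTF-8"
                         lengthKind="delimited"/>
          </dfdl:defineFormat>
          <!-- ...and make them the default for the whole schema -->
          <dfdl:format ref="textField"/>
        </xs:appinfo>
      </xs:annotation>

      <!-- declare the repeated 10-child structure once, as a global type -->
      <xs:complexType name="CommonChildren">
        <xs:sequence dfdl:separator=",">
          <xs:element name="child01" type="xs:string"/>
          <!-- ... child02 to child10 ... -->
        </xs:sequence>
      </xs:complexType>

      <!-- every element reuses the type instead of repeating the children -->
      <xs:element name="element0001" type="CommonChildren"/>
      <xs:element name="element0002" type="CommonChildren"/>
      <!-- ... -->
    </xs:schema>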
Hopefully, you get my drift. I can't be more specific without a bit more detail about your 1,200 elements and their DFDL properties. Please don't attempt to post the entire XSD, though.
Quote:
    because some of the maps will never happen due to a condition not being met, I had to create the header with default names set, to ensure I maintain a constant number of columns: their Excel workbook will explicitly reference columns and expect them to be there regardless of whether data is present
More details, please. DFDL can automatically insert delimiters (e.g. commas) for missing fields if you set up the model correctly. You don't need to create all of the 'missing' fields in the message tree.
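For instance, a sketch of a record whose sequence is told never to suppress its separators, so an absent optional field still gets its comma written. The property name here is per the DFDL 1.0 spec; older product levels spell it dfdl:separatorPolicy="required", so check your level.
Code:
    <!-- the row writes all of its separators even when fields are empty -->
    <xs:element name="Record" maxOccurs="unbounded">
      <xs:complexType>
        <xs:sequence dfdl:separator=","
                     dfdl:separatorPosition="infix"
                     dfdl:separatorSuppressionPolicy="never">
          <xs:element name="Field001" type="xs:string" minOccurs="0"/>
          <xs:element name="Field002" type="xs:string" minOccurs="0"/>
          <!-- ... Field003 to Field600 ... -->
        </xs:sequence>
      </xs:complexType>
    </xs:element>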

bdaoust (Centurion)
Posted: Wed Oct 28, 2015 8:45 am
Sorry for the delay on this.
So I'm thinking the slow deployment might be because, when a BAR file containing a DFDL schema is deployed, a schema validation occurs?
Is that true? And if so, can a bad schema be the cause of the slow deployment without throwing errors?
I'm going back to the CSV samples to see if I can spot what I may be missing.
All I want is a DFDL schema that I can map to. It should allow 600 fields. If a field is missing at the time the message is written, it should have an empty value. All fields should use ',' as the delimiter, and every field needs a specific named header to identify the column when it's imported into Excel.

mqjeff (Grand Master)
Posted: Wed Oct 28, 2015 9:28 am
The DFDL has to be compiled during deployment.
Errors in the DFDL can cause the deployment to fail, but I wouldn't think they would cause the deployment to run slower...

stoney (Centurion)
Posted: Wed Oct 28, 2015 9:47 am
When you deploy XML Schema files, they are validated and then compiled into a binary representation suitable for use with the XMLNSC parser.
When you deploy DFDL Schema files, they are validated and then compiled into a binary representation suitable for use with the DFDL parser.
Because DFDL Schema files are also valid XML Schema files, they are also validated and then compiled into a representation suitable for use with the XMLNSC parser.
The validation and the compilation can be expensive in both CPU time and memory requirements - depending on the number, size, and content of those schema files.
In V10, you can put your schemas in a shared library, deploy that shared library once, and then reference that shared library from an application.
You can deploy the application as many times as you like, and it won't recompile the schemas - unless you redeploy that shared library.

bdaoust (Centurion)
Posted: Wed Oct 28, 2015 1:20 pm
Quote:
    The validation and the compilation can be expensive in both CPU time and memory requirements - depending on the number, size, and content of those schema files.
I'm definitely finding this out: CPU and memory utilization while deploying is awful, to the point where my deployments time out. And my schema only has 1,200 elements. I wouldn't think that's a huge number of elements, but what do I know.
Quote:
    In V10, you can put your schemas in a shared library, deploy that shared library once, and then reference that shared library from an application.
    You can deploy the application as many times as you like, and it won't recompile the schemas - unless you redeploy that shared library.
Any workaround in version 9? Up above it was suggested to apply fix packs; I'm awaiting my own local broker to try that.