Using DDD, How Does One Implement Batch Processing?

#1
I have logic consisting of selecting a large number of records from one system, performing multiple transformations (based on business rules) and inserting them into another system.

It seems like a significant performance (and memory) hit to instantiate each of these records as an object, perform transformations on them, and then insert all of these objects into the other system.

Is the best way to achieve this in DDD to skip the classes/objects and do it straight through SQL, maybe a stored procedure?

Is there a better way using DDD to achieve this goal?

Note: The systems use SQL databases; at the moment, object stores like CouchDB are not an option.

#2
Many distributed systems built on DDD use an event-driven architecture. Rather than waiting to perform all the transformations in one batch, each entity raises an event as it undergoes the state change that would cause your system to transform it. That event is published to a message bus of some kind (e.g. Mule for Java, MassTransit for .NET). Your transformation system subscribes to these events and, as each message arrives, performs the transformation on the entity identified in the message and then publishes another message to the destination system.
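To make the flow concrete, here is a minimal sketch in Python. It is not Mule or MassTransit: InMemoryBus is a hypothetical in-process stand-in for a real message bus, and the event names, fields, and the name-normalising rule are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class CustomerChanged:
    """Hypothetical event raised by the source system when an entity changes."""
    customer_id: int
    name: str


@dataclass(frozen=True)
class CustomerTransformed:
    """Hypothetical message published to the destination system."""
    customer_id: int
    display_name: str


class InMemoryBus:
    """Stand-in for a real message bus; delivers synchronously, in process."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[type, List[Callable]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: object) -> None:
        # Deliver the event to every handler registered for its exact type.
        for handler in list(self._handlers[type(event)]):
            handler(event)


def transform(event: CustomerChanged) -> CustomerTransformed:
    # Placeholder business rule (an assumption): normalise the customer name.
    return CustomerTransformed(event.customer_id, event.name.strip().title())


bus = InMemoryBus()
received: List[CustomerTransformed] = []

# The transformation system subscribes to source events and republishes results.
bus.subscribe(CustomerChanged, lambda e: bus.publish(transform(e)))
# The destination system subscribes to the transformed messages.
bus.subscribe(CustomerTransformed, received.append)

# Each state change is handled as it happens, one record at a time.
bus.publish(CustomerChanged(1, "  ada lovelace "))
bus.publish(CustomerChanged(2, "grace HOPPER"))
```

Only one record is in memory at a time here, which is the point of the approach. A real bus would add durability, retries, and delivery across processes.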

This kind of "trickle processing" can run continuously, all day long, without putting the kind of load on your system that would necessitate running the job after hours. If you're concerned about performance, this kind of architecture might result in a system that has the last record transformed 5 minutes after close of business (COB), where a batch job might not even be able to run until 3 am (after all the other batch jobs have finished).

If, for example, you truly don't want the target system to be updated until midnight, just queue the messages up until midnight and then publish them to the destination system's endpoint.
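A rough sketch of that "hold until midnight" idea in Python follows. DeferredPublisher, flush_if_due, and the message shape are hypothetical names invented for this example, and the buffer is a plain in-memory list; a production system would presumably use a durable queue instead.

```python
from datetime import datetime
from typing import Callable, List


class DeferredPublisher:
    """Buffers messages and forwards them only once release_at has passed."""

    def __init__(self, send: Callable[[dict], None], release_at: datetime) -> None:
        self._send = send              # callable that delivers to the destination
        self._release_at = release_at  # e.g. the coming midnight
        self._pending: List[dict] = []

    def publish(self, message: dict) -> None:
        # Transformation can happen all day; delivery is merely postponed.
        self._pending.append(message)

    def flush_if_due(self, now: datetime) -> int:
        """Forward all buffered messages if now >= release_at; return count sent."""
        if now < self._release_at:
            return 0
        sent = 0
        while self._pending:
            self._send(self._pending.pop(0))  # preserve arrival order
            sent += 1
        return sent


delivered: List[dict] = []
publisher = DeferredPublisher(delivered.append, release_at=datetime(2024, 1, 2, 0, 0))

publisher.publish({"id": 1})
publisher.publish({"id": 2})

# Shortly after close of business: nothing is released yet.
early = publisher.flush_if_due(datetime(2024, 1, 1, 17, 5))
delivered_before_midnight = list(delivered)

# At midnight: everything queued during the day is published in order.
late = publisher.flush_if_due(datetime(2024, 1, 2, 0, 0))
```

Passing the current time in as an argument, rather than reading the clock inside the class, keeps the release rule easy to test.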

Greg Young has blogged and presented extensively on this kind of architecture. Check out his work on InfoQ.


