Debatching data is the process of turning one huge pile of data into many small piles of data. Why is it better to shovel one ton of data with two thousand one-pound shovelfuls instead of a single scoop from a huge power shovel? After all, large commercial databases and the attendant bulk loader or SQL Loader programs are designed to do exactly that: insert huge loads of data in a single shot. The bulk load approach, however, works only under certain tightly constrained circumstances (a minimal bulk-load sketch follows the list). They are as follows:
  1. The "bulk" data comes to you already matching the table structure of the destination system. Of course, this may mean that it was debatched before it gets to your system.
  2. The destination system can accept some, potentially significant, error rate when individual rows fail to load.
  3. There are no updates or deletes, just inserts.
  4. Your destination system can handle bulk loads. Certain systems (for example, some legacy medical systems or other proprietary systems) cannot handle bulk operations.
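
When these conditions do hold, a bulk load is typically a single statement. The sketch below is illustrative only, assuming SQL Server, a hypothetical destination table dbo.Orders whose columns already match the file layout (constraint 1), and a hypothetical CSV file at C:\data\orders.csv; the MAXERRORS option is the kind of error tolerance constraint 2 refers to.

    -- Minimal bulk-load sketch (assumed table and file names, not from the chapter).
    BULK INSERT dbo.Orders
    FROM 'C:\data\orders.csv'
    WITH (
        FIELDTERMINATOR = ',',   -- column delimiter in the file
        ROWTERMINATOR   = '\n',  -- row delimiter in the file
        FIRSTROW        = 2,     -- skip the header row
        MAXERRORS       = 50     -- tolerate up to 50 bad rows before aborting
    );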
As the vast majority of data transfer situations will not meet these criteria, we must consider other options. First, one must decide on which side of the database event horizon to perform these tasks. One could, for example, simply dump an entire large file into a staging table on SQL Server and then debatch using SQL to move the data to the "permanent" tables. There are, of course, multiple tools for debatching large bulk data loads, including BizTalk Server and SQL Server Integration Services (SSIS). One can use such tools to break up large batches of data, manipulate the data as needed, and send it on to its next reincarnation (for example, into an API, a relational database, or a text file). In this chapter, Packt Enterprise takes a look at options for processing large data sets.
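
To make the staging-table route concrete, here is a minimal T-SQL sketch that moves rows a few thousand at a time from a hypothetical staging table, dbo.OrdersStaging, into a hypothetical permanent table, dbo.Orders; the table and column names are assumptions for illustration, not taken from the chapter.

    -- Debatching sketch: drain the staging table in small batches.
    -- Assumes dbo.Orders has no triggers or foreign keys that would
    -- block the OUTPUT ... INTO clause.
    DECLARE @BatchSize int = 5000;

    WHILE 1 = 1
    BEGIN
        BEGIN TRANSACTION;

        -- Move one batch: delete from staging and insert the deleted
        -- rows into the permanent table in a single statement.
        DELETE TOP (@BatchSize)
        FROM dbo.OrdersStaging
        OUTPUT DELETED.OrderId, DELETED.CustomerId, DELETED.OrderDate
        INTO dbo.Orders (OrderId, CustomerId, OrderDate);

        IF @@ROWCOUNT = 0
        BEGIN
            ROLLBACK TRANSACTION;  -- staging table is empty; stop
            BREAK;
        END

        COMMIT TRANSACTION;
    END

Working in small batches keeps each transaction short and means that a failure affects only the batch in flight rather than the entire load, which is the essential point of debatching.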

Packt Enterprise books can be summed up by the tagline "Professional Expertise Distilled": they take in-the-trenches knowledge from experienced software professionals and distil it into a single, easy-to-follow manuscript.

Keywords
Packt Enterprise, Debatching Bulk Data – Free 33 Page Chapter
Offered by
Packt Enterprise