Introduction to Big Data
The essential problem of dealing with big data is, in fact, a resource issue: the larger the volume of data, the more resources are required in terms of memory, processors, and disks.
The goal of performance optimization is either to reduce resource usage or to use the available resources more efficiently, so that it takes less time to read, write, or process the data.
The ultimate objectives of any optimization should include:
Maximized use of available memory
Reduced disk I/O
Minimized data transfer over the network
Parallel processing to fully leverage multiple processors
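As a small illustration of the "reduced disk I/O" objective, here is a minimal Python sketch that computes several aggregates in a single pass over a CSV source instead of rereading the data once per metric. The function name `summarize_amounts` and the `amount` column are hypothetical, chosen only for this example.

```python
import csv
from typing import IO, Dict


def summarize_amounts(source: IO[str]) -> Dict[str, float]:
    """Compute count, total, and maximum of the 'amount' column in one pass.

    'amount' is an assumed column name used only for illustration.
    """
    count = 0
    total = 0.0
    largest = float("-inf")
    # csv.DictReader yields one row at a time, so the whole file is never
    # held in memory and the data is read from disk only once.
    for row in csv.DictReader(source):
        value = float(row["amount"])
        count += 1
        total += value
        largest = max(largest, value)
    return {"count": count, "total": total, "max": largest}
```

With a real file, the function could be called as `summarize_amounts(open("sales.csv", newline=""))`, where `sales.csv` is a placeholder name. The same single-pass idea carries over to larger systems: each extra scan of the data costs another round of disk reads.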
Four key principles for designing or optimizing your data processes or applications