How to improve performance by processing database results in parallel?
I have a .NET application that runs roughly 20 to 30 SQL queries and processes the results one at a time. I have been trying to improve performance by doing some of the work in parallel.
Two of the queries take 75% of the time, purely because of the amount of data they return. My initial experiment was to split each of these queries into 4 buckets using NTILE and process each data reader in parallel. If anything, this takes a lot longer, I think because of the extra work involved in computing NTILE and querying the database 4 times instead of once.
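To make the NTILE experiment concrete, here is a minimal sketch of how the four bucket queries could be generated. The table name big_table and the ordering column id are hypothetical placeholders, not my real schema, and the helper NtileBuckets is made up purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NtileBuckets
{
    // Builds one query per bucket. The table and column names passed in
    // are hypothetical; substitute the real ones.
    public static List<string> BuildBucketQueries(string table, string orderColumn, int buckets)
    {
        return Enumerable.Range(1, buckets)
            .Select(n =>
                "SELECT * FROM (" +
                "SELECT t.*, NTILE(" + buckets + ") OVER (ORDER BY t." + orderColumn + ") AS bucket " +
                "FROM " + table + " t) " +
                "WHERE bucket = " + n)
            .ToList();
    }
}
```

Calling NtileBuckets.BuildBucketQueries("big_table", "id", 4) yields four statements that differ only in the final bucket filter. Each one makes the database compute NTILE over the whole result set before it can filter down to a single bucket, which fits the extra cost I suspect.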
Can anyone suggest other techniques to try, or am I just wasting my time here? The code below is part of a utility class that lets me queue up the functions which process the reader. For my NTILE experiment, I queue up 4 tasks, each processing 1/4 of the data (where ntile = 1, 2, 3, 4), and call Execute to run them in parallel.
    foreach (var keyValuePair in m_Tasks)
    {
        var sql = keyValuePair.Key;
        var task = keyValuePair.Value;
        var conn = new OracleConnection(ConnectionString);
        var cmd = conn.CreateCommand();
        cmd.CommandText = sql;
        // "a" is presumably the IAsyncResult from the asynchronous callback; the call that starts the query is not shown here.
        var reader = cmd.EndExecuteReader(a);
        DateTime endIO = DateTime.Now;
        Console.WriteLine(TaskName + " " + Thread.CurrentThread.ManagedThreadId + " IO took: " + (endIO - startTime) + " ended at " + endIO);
        DateTime taskStart = DateTime.Now;
        // The queued task processes the reader at this point (call not shown here).
        DateTime endTask = DateTime.Now;
        Console.WriteLine(TaskName + " " + Thread.CurrentThread.ManagedThreadId + " Task took: " + (endTask - taskStart) + " ended at " + endTask);
        if (Interlocked.Decrement(ref numTasks) == 0)
        {
            DateTime endExecute = DateTime.Now;
            Console.WriteLine(TaskName + " " + Thread.CurrentThread.ManagedThreadId + " EXECUTE took: " + (endExecute - startTime) + " ended at " + endExecute);
        }
    }
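The queue-and-execute pattern can be sketched in isolation, without any database access. This is only an illustration under assumptions: the class name ParallelTaskQueue, the Add method, and the use of Task.Run with ManualResetEventSlim are choices made for the sketch, not the actual utility class. A plain string key stands in for the SQL text, and the queued action stands in for the function that processes the reader.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the utility class: queues work items and runs them in parallel.
public class ParallelTaskQueue
{
    private readonly List<KeyValuePair<string, Action<string>>> m_Tasks =
        new List<KeyValuePair<string, Action<string>>>();

    // Queue a work item; "key" stands in for the SQL text.
    public void Add(string key, Action<string> task)
    {
        m_Tasks.Add(new KeyValuePair<string, Action<string>>(key, task));
    }

    // Runs every queued item on the thread pool, blocks until all have
    // finished, and returns the number of items that ran.
    public int Execute()
    {
        int numTasks = m_Tasks.Count;
        if (numTasks == 0) return 0;

        int completed = 0;
        using (var allDone = new ManualResetEventSlim(false))
        {
            foreach (var keyValuePair in m_Tasks)
            {
                var key = keyValuePair.Key;
                var task = keyValuePair.Value;
                Task.Run(() =>
                {
                    try
                    {
                        task(key);
                    }
                    finally
                    {
                        Interlocked.Increment(ref completed);
                        // The last item to finish releases the waiting caller.
                        if (Interlocked.Decrement(ref numTasks) == 0)
                            allDone.Set();
                    }
                });
            }
            allDone.Wait();
        }
        return completed;
    }
}
```

With four items queued (one per NTILE bucket), Execute returns once the last one has decremented the counter to zero. Exceptions thrown by a work item are not surfaced in this sketch.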
Thanks for any help.
I have tried to write a parallel implementation of mergesort using threads and templates. The relevant code is listed below. I have compared the performance with sort from the C++ STL. My code is 6 ti
I need to know how indexing in MongoDB improves query performance. Currently my database is not indexed. How can I index an existing database? Also, do I need to create a new field only for indexing?
I'm using phonegap to access a database on the device to perform some inserts, however I'm getting less than ideal performance. Questions: I see that there are some SQLite plugins for phonegap out th
How do i increase the performance of below linq query? While running it, it threw an error of System.OutOfMemoryException. Note: I have a lot of records in XrmContext.sun_POSSet entity var a = (from t
So I have pytest run my tests and that's great, but I want to actually do something with the test results. I was using unittest, and that gives me a swanky results object that I can process after
I am trying to develop a recursive extractor. The problem is that it recurses too much (every time it finds an archive type) and takes a performance hit. So how can I improve the code below? My Idea 1: Ge
I am working on performance optimization for our legacy application. It uses VC++ 2008, and the OS is Windows XP or above. During installation, it will parse a file and write some information about the file into regi
I am doing my first steps with Cython, and I am wondering how to improve performance even more. Until now I got to half the usual (python only) execution time, but I think there must be more! I know c
We have our database servers separate from our webserver. The database servers are replicated (we know there is overhead here). Even with replication turned off however, performance for large number o
I built a sample program to check the performance of tasks in parallel, with respect to the number of tasks running in parallel. Few assumptions: Operation is on thread is independent of another threa
The script below takes more than 10 hours to execute. It contains three nested cursors, and I think they are the main culprit. I have searched a lot for ways to replace the cursors or improve the performanc
I want to process data in parallel using a cluster of ServiceMix / ActiveMQ / Camel. It seems I can achieve that by first splitting the data up, then distributing it via multiple JMS messages and an A
I'm working with a big XML file that I need to download and parse. It contains 65k objects, and parsing takes more than a minute. I can't work out how to optimize the loading/parsing; please give me some advice.
I have only a little knowledge of database design and SQL. I wrote a simple student score manager to learn programming and databases, but my design was terrible. Database Struct My databas
I am reading a CSV file and saving the data from the CSV file to my database. I'm using Streamreader's ReadLine() to read every line and then insert it into my database, which is working fine. But aft
I have only 3237 records in the database and I used UISearchDisplayController and NSFetchedResultsController for search. but when I type the keyword for search it's extremely slow: 2012-06-26 10:26:25
I'm using the phrases Parallel Processing & Multi Threading interchangeably because I feel there is no difference between them. If I'm wrong please correct me. I'm not a pro in Parallel Processin
This code runs on a file of at least 200M lines, and it takes a lot of time. I would like to know if I can improve the runtime of this loop. my @bin_lsit; #list of 0's and 1's while (my $line
I have written a procedure which takes several input parameters. I need to validate the input. If they are valid values they have to be present in the database table - attribute_values. The problem is
What are some things I can do to improve query performance of an oracle query without creating indexes? Here is the query I'm trying to run faster: SELECT c.ClaimNumber, a.ItemDate, c.DTN, b.FilePath
I have built classified website using Yii php framework. Now it is getting a lot of traffic. So I want to using caching to optimize the performance of the website. There are two controllers I want to
I was playing with the parallel Haskell functions par and pseq and discovered something interesting. My examples are based on the examples from the book Real World Haskell (Parallel programming in Haskell)
Hello I am trying to run a program that finds closest pair using brute force with caching techniques like the pdf here: Caching Performance Stanford My original code is: float compare_points_BF(int N,
Cassandra's CQL doesn't have anything like MySQL's LIKE clause for searching for more specific data in the database. I have looked into it and come up with some ideas: 1. Using Hadoop 2. Using MySQL server to
I use a chinese custom font-face called YaHei. For the implementation I used http://fontface.codeandmore.com/, but the font face file is 10.01MB. Is there a way to compress the file or to improve the
How do I configure HBase so that the scanner only retrieves a number of records at a time? Or how do I improve the scanner when the database contains a lot of records?
Awk processes the files line by line. Assuming each line operation has no dependency on other lines, is there any way to make awk process multiple lines at a time in parallel? Is there any other text
We already have parallel fan-out working in our code (using ParallelEnumerable) which is currently running on a 12-core, 64G RAM server. But we would like to convert the code to use Rx so that we can
I am seeking advice on how I can improve this in terms of speed: My Data-model: class Events(ndb.Model): eventid = ndb.StringProperty(required=True) participants = ndb.StringProperty(repeated=True)
I am using the task parallel library from .NET framework 4 (specifically Parallel.For and Parallel.ForEach) however I am getting extremely mediocre speed-ups when parallelizing some tasks which look l
I'm a student and we are making a simple information system for a hospital. How can we improve the security of the MySQL database so that confidential information is protected?
I'm new to WCF Data Services so I've been playing. After some initial tests I am disappointed by the performance of my test data service. I realize that because a WCF DS is HTTP-based there is overhea
I would like to know how well Processing sketches perform on Android. Here is the link for more info about Processing-Android : http://wiki.processing.org/w/Android#Instructions I don't rea
How will FastCGI improve PHP performance, and is it recommended for my TYPO3 CMS? Will it produce any side effects?
The code looks like this: sum += array[j] + array[j+1] + array[j+2] + ... + array[j+n]; How do I replace the j+n inside the brackets to improve the timing?
Here is my query, which takes nearly 20 minutes. Please suggest changes to improve performance. SELECT DISTINCT CONVERT(varchar(10),x.notice_date,120) Date, Y.branch_name, count(case when x.status='broken' and
To put it simply, I am a fairly new PHP coder and I was wondering if anyone could guide me towards the best ways to improve performance in code as well as stopping those pesky memory leaks, my host is
I am working on people detecting using two different features HOG and LBP. I used SVM to train the positive and negative samples. Here, I wanna ask how to improve the accuracy of SVM itself? Because,
I've a query that performs join on many tables which is resulting in poor performance. To improve the performance, I've created an indexed view and I see a significant improvement in the performance o
Setup: Neo4j - 1.9.3 ~7,000 nodes ~1.8 million relationships I have the following cypher query that I would like to improve the performance on: START a=node(2) MATCH (a)-[:knowledge]-(x)-[:depends]-
I am trying to process data in parallel using IPython's parallel processing. I am following the instructions by @minrk in answer to the question on how to get intermediate results in ipython parallel
I am learning methods for improving code performance. As practice, I wrote a function which returns basic descriptive statistics, like psych::describe. I've tried different versions of the loops and at the mom
I have working activity with 12 spinners that are linked to a single database table of over 20,000 records. Each spinner is bound to a different query to make the selections dynamic (based upon the pr
I have been taking a look on some sites and they all talk about using tag selectors instead of classes to improve the performance. For example, this: $(input.myclass); Instead of this: $(.myclass
I have a method that collects share information and writes the results to a database. I use Parallel.ForEach, which increased performance, especially when scanning 100 TB. If I run this code in my local data
I am working on an Android application using real-time OCR. I am using OpenCV and the Tesseract library. But the performance is very poor, even on my Galaxy SIII. Are there any methods to improve the performa
I have a method that has to execute SQL scripts many times. These scripts are used to create tables, views, stored procedures, and functions on the database. I came up with this code which works fine