Thursday, March 01, 2007

OraSRP author replies

Note to subscribers: If your reader does not properly display some formatted content like bold and italics (I know that at the very least, Bloglines has this problem), I suggest you visit the blog site itself.



First of all, I would like to thank OraSRP's author, Egor Starostin, for dropping by and making a comment on yesterday's blog entry, OraSRP v3.0b - a very short initial impression. I've decided to put my reply in a separate blog entry so that readers who do not subscribe to comment feeds can pick it up.

With regards to my observation that the number of lines that OraSRP reports is not equal to the actual number of lines in the raw trace file, Egor says:

OraSRP ignores some lines like
'===' or '*** 2007-02-28 12:45:08.424' and doesn't report them as 'processed'. So, 'Total lines processed in trace file' != 'Total lines in trace file'.


That's good to know, and I suspected that much. But without that clarification, a different line count may mean that there are some relevant lines that are being missed. And when relevant lines are missed, it then becomes possible that the output may be affected to a point that it no longer presents a correct profile. When the profile is not correct, then that may lead people to try to solve problems that are not the real ones to begin with. This is very important to me because the real strength of a correct profile, at least as far as I'm concerned, is its ability to pinpoint the root cause of a problem.


Of course, I have no problems bypassing irrelevant lines and since this is already being done, I respectfully suggest that the number of lines in the raw trace file still be reported, even as a separate line item, so that people don't get thrown off by differing numbers.


Now, let's double check the numbers and take into account the types of lines that Egor mentioned above. Let me deal with the '^===' lines first:

jforonda@u20$ egrep '^===' 4071.trc | head -5
=====================
=====================
=====================
=====================
=====================
jforonda@u20$


That looks good enough. Now, let's count them:


jforonda@u20$ egrep '^===' 4071.trc | wc -l
2165
jforonda@u20$


So we have 2,165 lines that fit this criteria.


For the date/timestamp lines:


jforonda@u20$ egrep '\*\*\* 200' 4071.trc | head -5
*** 2004-10-18 19:51:28.453
*** 2004-10-18 19:55:50.302
*** 2004-10-18 19:56:07.899
*** 2004-10-18 19:56:48.561
*** 2004-10-18 19:57:41.135
jforonda@u20$

jforonda@u20$ egrep '\*\*\* 200' 4071.trc | wc -l
106
jforonda@u20$


That's another 106 lines. If we add these two numbers to the total reported by OraSRP, we get a total of 2,152,611:

jforonda@u20$ bc
2150340+106+2165
2152611
jforonda@u20$

This means that there are still 4,239 lines that are not accounted for:

jforonda@u20$ bc
2156850-2152611
4239
jforonda@u20$



This is still a lot of lines that are not unaccounted for. Now, Egor's comment said "OraSRP ignores some lines like..." and that makes me think that there are other types of lines, aside from those that were already mentioned, that are being bypassed on purpose. If Egor gives me a complete list of the types of lines that are being bypassed on purpose, I'd be happy to check my trace file.


With regards to the different totals between the “with idle” and "without idle” runs, Egor said:

As for different times with/without idle waits recognition -- I believe it's a bug. Unfortunatey, I can't reproduce it yet, so I can't fix it too.

In as much as I want to help, I will say now that I can not give this trace file to Egor. As most people reading this probably know, 10046 trace files contain confidential information. This is also the reason why I crossed out the instance and node names from the screenshots that I posted yesterday. If there are readers who can share their trace files to Egor, I suggest that they try to do what I've done here and see if they can reproduce the problems that I have described.


Now, there's still that issue about the total elapsed time but I have to attend to other things so I will stop for now. I will post another entry to talk about the elapsed time within the next day or so. If you want to follow this topic, it will make it easier if you subscribe. :)