Robert Prins
2018-08-15 21:37:41 UTC
Hi all,
Not really specific x86, Virtual Pascal, or PL/I, but as I've got the same code
written in all three languages, I'll give it a try here:
I have an array (actually three, but that's not very relevant for the issue) of
pointers to items in a linked list. The array is sorted into a specific order,
and after the sort, I use it to print Top-N tables. For the main Top-50 table (x
3) that's easy enough, the first 50 items are selected.
However, at this time the linked list also contains 5 sub-lists, each containing
additional sub-lists for a total of 390 (yes, three-hunderd-and-ninety)
sub-lists for which I want to print Top-10 tables (or if a sub-list contains
just 4 items, a Top-4 table). Because I don't really want to sort the main array
six (x 3) times, I just scan the array from the top, and pick out the items for
each sub-list, until I have 10 (or the 4 mentioned before) items. To speed up
the process a little, I make an initial pass over the array to find the first
item for each sub-list, but that doesn't really improve things a great deal.
So given that "this" is the list, sorted on its main key, with the second and
third columns identifying additional sub-lists,
100, 'a', '1'
099, 'b', '9'
098, 'c', '1'
097, 'b', '8'
096, 'd', '3'
095, 'a', '2'
094, 'c', '8'
093, 'c', '1'
092, 'a', '3'
091, 'f', '4'
090, 'a', '2'
the "Top-50" would contain the above.
For sub-list 'a', I would scan the array, and the resulting "Top-10" would be
100, 'a', '1'
095, 'a', '2'
092, 'a', '3'
090, 'a', '2'
Likewise, for sub-list 'b', I would scan the array, and the resulting "Top-10"
would be, and the initial fill-the-cache scan would have set the starting index
for the 'b' sub-list to the cell containing 099, saving me one superfluous compare.
099, 'b', '9'
097, 'b', '8'
etc, and the same would happen for the sub-lists defined by column 3, the first
being
100, 'a', '1'
098, 'c', '1'
093, 'c', '1'
As I wrote, I make an initial pass over the data to cache the first item for
each sub-list, some of them consist of a single item and are nearly at the end
of the sorted full array, and this saves some time, the count of executed
statements is reduced by around 10%, but if anyone can think of a way to speed
up things a bit more, without having to sort the array(s) five more times (in an
additional 10,000,000 executed statements, see below), please share it with me.
Some numbers:
- the caching code executes around 120,000 PL/I statements
- sorting the three arrays executes around in 2,000,000 PL/I statements (using a
Heapsort)
- printing the 390 top-n tables executes around 7,600,000 PL/I statements, and
the "IF" statement that selects the data to be printed has a hit-percentage of
just 0.5%:
IF STY_TCN = 'Tr' & STY = LIST.TR | /* 208 type 1 */ 1,841,754 executed IF's
STY_TCN = 'Na' & TCN = LIST.NA | /* 93 type 2 */
STY_TCN = 'Yr' & STY = LIST.YR | /* 37 type 3 */
STY_TCN = 'Co' & TCN = LIST.CO | /* 33 type 4 */
STY_TCN = 'Ty' & TCN = LIST.TY | /* 19 type 5 */
STY_TCN = '**' THEN /* 1 main */
DO;
#T = #T + 1; 9,594 - the actual
number of Top-N lines printed
Robert
Not really specific x86, Virtual Pascal, or PL/I, but as I've got the same code
written in all three languages, I'll give it a try here:
I have an array (actually three, but that's not very relevant for the issue) of
pointers to items in a linked list. The array is sorted into a specific order,
and after the sort, I use it to print Top-N tables. For the main Top-50 table (x
3) that's easy enough, the first 50 items are selected.
However, at this time the linked list also contains 5 sub-lists, each containing
additional sub-lists for a total of 390 (yes, three-hunderd-and-ninety)
sub-lists for which I want to print Top-10 tables (or if a sub-list contains
just 4 items, a Top-4 table). Because I don't really want to sort the main array
six (x 3) times, I just scan the array from the top, and pick out the items for
each sub-list, until I have 10 (or the 4 mentioned before) items. To speed up
the process a little, I make an initial pass over the array to find the first
item for each sub-list, but that doesn't really improve things a great deal.
So given that "this" is the list, sorted on its main key, with the second and
third columns identifying additional sub-lists,
100, 'a', '1'
099, 'b', '9'
098, 'c', '1'
097, 'b', '8'
096, 'd', '3'
095, 'a', '2'
094, 'c', '8'
093, 'c', '1'
092, 'a', '3'
091, 'f', '4'
090, 'a', '2'
the "Top-50" would contain the above.
For sub-list 'a', I would scan the array, and the resulting "Top-10" would be
100, 'a', '1'
095, 'a', '2'
092, 'a', '3'
090, 'a', '2'
Likewise, for sub-list 'b', I would scan the array, and the resulting "Top-10"
would be, and the initial fill-the-cache scan would have set the starting index
for the 'b' sub-list to the cell containing 099, saving me one superfluous compare.
099, 'b', '9'
097, 'b', '8'
etc, and the same would happen for the sub-lists defined by column 3, the first
being
100, 'a', '1'
098, 'c', '1'
093, 'c', '1'
As I wrote, I make an initial pass over the data to cache the first item for
each sub-list, some of them consist of a single item and are nearly at the end
of the sorted full array, and this saves some time, the count of executed
statements is reduced by around 10%, but if anyone can think of a way to speed
up things a bit more, without having to sort the array(s) five more times (in an
additional 10,000,000 executed statements, see below), please share it with me.
Some numbers:
- the caching code executes around 120,000 PL/I statements
- sorting the three arrays executes around in 2,000,000 PL/I statements (using a
Heapsort)
- printing the 390 top-n tables executes around 7,600,000 PL/I statements, and
the "IF" statement that selects the data to be printed has a hit-percentage of
just 0.5%:
IF STY_TCN = 'Tr' & STY = LIST.TR | /* 208 type 1 */ 1,841,754 executed IF's
STY_TCN = 'Na' & TCN = LIST.NA | /* 93 type 2 */
STY_TCN = 'Yr' & STY = LIST.YR | /* 37 type 3 */
STY_TCN = 'Co' & TCN = LIST.CO | /* 33 type 4 */
STY_TCN = 'Ty' & TCN = LIST.TY | /* 19 type 5 */
STY_TCN = '**' THEN /* 1 main */
DO;
#T = #T + 1; 9,594 - the actual
number of Top-N lines printed
Robert
--
Robert AH Prins
robert(a)prino(d)org
Robert AH Prins
robert(a)prino(d)org