[Silas] - 2005-12-8 - The objective of this page is to discuss alghorithms to help in "query synthesys" (I don't know if "synthesys" and "synthesize" are the best word to discuss a stuff like that. If not, please, rename or remove this page). Suppose you have a list with redundat data: {Owner Car} {Silas Ferrari} {Silas Porche} {John Honda} {Smith Renaut} {Smith Ford} See this list has redundant data. If I want to show it in a report, it probably be confuse to the viewer. So it's a good idea to take off this junk data: {Owner Car} {Silas Ferrari} {{} Porche} {John Honda} {Smith Renaut} {{} Ford} This kind of information is common from queries with JOINs. I developed a cool proc which takes off this junk data. It is in version 0.1 yet, because I'm still trying it... not 100% stable. Needs some fixes. I'll update it as I note errors. For now, it conforms to my purposes: '''Hey! [RS]'s code (or the modified version) on the bottom is the best way to achieve the objective. Should we take the proc below off here?''' proc synthesize {List reference Fields} { # synthesize: Receive a list of lists and synthesizes a query with rows # returned by a JOIN. Returns the synthesized list # # List: List to be synthesized # reference: fields to be compared # Fields: Fields which will be used in synthesize. It's a list of indexes # version 0.1 - 2005-12-06 # by [Silas] Justiniano - silasdb at gmail dot com set i 0 while {$i < [llength $List]} { for {set k 0} {$k < [llength [lindex $List 0]]} {incr k} { set aux($k) {} } set j [expr $i+1] while {[lindex $List $i $reference] eq [lindex $List $j $reference]} { for {set k 0} {$k < [llength [lindex $List 0]]} {incr k} { if {[lsearch $aux($k) [lindex $List $j $k]] == -1 && [lsearch $Fields $k] != -1} { lappend aux($k) [lindex $List $j $k] } } incr j if {[expr $j-1] == [llength $List]} {return $List} } set max 0 foreach valor [array names aux] { if {[lindex $aux($valor) 0] == [lindex [lindex $List $i] $valor]} { set aux($valor) [lreplace aux($valor) 0 0] } if {$max < [llength $aux($valor)]} { set max [llength $aux($valor)] } } for {set k $i} {$k <= [expr $max+$i]} {incr k} { for {set l 0} {$l < [llength [lindex $List 0]]} {incr l} { if {$k == $i} { lappend Itens [lindex [lindex $List $i] $l] } else { if {[lsearch $Fields $l] == -1} { lappend Itens {} } else { lappend Itens [lindex $aux($l) 0] set aux($l) [lreplace $aux($l) 0 0] } } } set List [linsert $List $k $Itens] unset -nocomplain Itens } set List [lreplace $List [expr $i+$max+1] [expr $j+$max]] if {$max == 0 && $i == [llength $List]-1} break incr i [expr $max+1] } return $List } #Testing...: set A { {Silas Ferrari} {Silas Porche} {John Honda} {Smith Renaut} {Smith Ford} } puts [synthesize $A 0 1] Notes the first argument is the list, the second is the "field" that you will use as reference to synthesize and the last is a list of indexes to be deleted when found more than one (is this explanation fine? :) ). As I said, the last argument can receive a list of indexes. So in the following list: {Owner Car Color} {Silas Porche Blue} {Silas Porche Black} {Silas Ferrari Red} {John Honda Blue} {Smith Renaut White} {Smith Renaut Gray} {Smith Ford Green} Each owner has a car, some have more than one, some of them are not different, but have different color. A synthesized list would be: {Owner Car Color} {Silas Porche Blue} {{} {} Black} {{} Ferrari Red} {John Honda Blue} {Smith Renaut White} {{} {} Gray} {{} Ford Green} I reached it using: puts [synthesize $A 0 {1 2}] Please, help to improve this proc. ---- [RS] experimented with this (which has no column-width formatting): proc synthesize llist { set last {} set ress {} foreach list $llist { set reslist {} foreach e $list f $last { lappend reslist [expr {$e eq $f? {}: $e}] } lappend ress $reslist set last $list } set ress } Testing: set data { {Owner Car Color} {Silas Porche Blue} {Silas Porche Black} {Silas Ferrari Red} {John Honda Blue} {Smith Renaut White} {Smith Renaut Gray} {Smith Ford Green} } % join [synthesize $data] \n Owner Car Color Silas Porche Blue {} {} Black {} Ferrari Red John Honda Blue Smith Renaut White {} {} Gray {} Ford Green ---- [Silas] - wow! [RS]'s code is really better than mine! It requires a list whose repeated values comes firstly. For example: {Kevin Honda Green} {Kevin Ford Green} "Kevin" is the repeated value here. "Green" is repeated too, but it is a property of the car, which is not repeated. Unfortunatelly [RS]'s code doesn't works fine with this kind of list: % synthesize {{Kevin Honda Green} {Kevin Ford Green}} {Kevin Honda Green} {{} Ford {} So I modified [RS]'s code: proc synthesize llist { set last {} set ress {} foreach list $llist { set reslist {} set found 0 foreach e $list f $last { if {$e eq $f && $found == 0} { lappend reslist {} } else { lappend reslist $e set found 1 } } lappend ress $reslist set last $list } return $ress } #Testing...: % puts [synthesize {{Kevin Honda Green} {Kevin Ford Green}}] {Kevin Honda Green} {{} Ford Green} Although it requires a list whose "repeated" values comes firstly, I think this [RS]'s modified code is much better than my first synthesize proc. It's easier, better and faster to get a list in this format directly from SQL and re-order it later than try to address the problem with a "non-ordered" list in this format. ---- [Silas] - 2005-12-09 - I noticed that queries results have a interesting organization form. They looks like a [fractal]. See: A B D A B E A C D A C E It's the basic form of a query with two join. So: SELECT Books.name, Subjects.name, Authors.name FROM Books INNER JOIN Books_Subject ON Books.book_id = Books_Subject.book_id INNER JOIN Subjects ON Books_Subject.subject_id = Subjects.subject_id INNER JOIN Books_Authors ON Books.book_id = Books_Authors.book_id INNER JOIN Authors ON Books_Authors.author_id = Authors.author_id; I think many-to-many (using a intermediate table) relationships are the best example here. The query above would return (for one book inserted in the database): {{The comunist manifest} {Politics} {Karl Marx}} {{The comunist manifest} {Politics} {Friedrich Engels}} {{The comunist manifest} {History} {Karl Marx}} {{The comunist manifest} {History} {Friedrich Engels}} [RS]'s modified code would return: {{The comunist manifest} Politics {Karl Marx}} {{} {} {Friedrich Engels}} {{} History {Karl Marx}} {{} {} {Friedrich}} It's still repeating some values. The next proc, makes it better: proc synthesize List { for {set i 0} {$i < [llength $List]} {incr i} { set First [lindex $List $i] set j [expr $i+1] while {[lindex $First 0] eq [lindex $List $j 0]} { for {set k 0} {$k < [llength $First]} {incr k} { if ![info exists aux($k)] {set aux($k) {}} if {[lsearch $aux($k) [lindex $List $j $k]] == -1} { lappend aux($k) [lindex $List $j $k] } } incr j } if {$j > $i+1} { set List [lreplace $List $i $j-1] set max 0 foreach valor [array names aux] { if {[llength $aux($valor)] > $max} {set max [llength $aux($valor)]} } for {set j 0} {$j < $max} {incr j} { for {set k 0} {$k < [llength $First]} {incr k} { lappend Items [lindex $aux($k) 0] set aux($k) [lreplace $aux($k) 0 0] } set List [linsert $List [expr $i+1] $Items] unset Items } } incr i [expr $j-1] } return $List } #Testing...: % synthesize $A {{The comunist manifest} Politics {Friedrich Engels}} {{} History {Karl Marx} It's another big proc (like [RS], I don't like big procs). Probably it can be improved and diminished... if someone has time, please, work on it. ---- [Category Algorithm]