Collection Library for GNU Objective-C
To Do list, and Questions.
*****************************************************************

I would greatly appreciate your feedback on the questions below.
Please email your thoughts to mccallum@cs.rochester.edu.

To Do
=====

* Fix tonnes of bugs.  (Metric because they are bigger, as are my
  bugs, I'm sure.)

* Finish RBTree, especially -removeElement:.  Finish SplayTree.

* Address the questions in the next section below.

* Test the inclusion of doubles in the elt union.

* Write a good hash function for floats and doubles.
  This from libg++.texi:
     `unsigned int foldhash(double x);'
     a hash function for doubles that exclusive-or's the first and
     second words of x, returning the result as an integer.

* Many implementations could be made more efficient.  Libcoll hasn't
  been efficiency tuned at all.  Overriding some methods in certain
  classes could make things more efficient (especially
  EltNodeCollector).

* Fix all the subclassResponsibility comments in *.h

* Add more classes?

* Redo archiving (waiting for Stream objects from Kresten).  Consider
  ownership issues.  If content object conforms to protocol, do the
  right thing.  Support a good "-printOn:" convention, a la Smalltalk.

* Write more programs for testing libcoll; clean up testing.  The
  programs should verify their own results.

* Fix installation procedure (I just put it in, and it's my first time
  using autoconf.  I'm liable to have done some silly things.  Also I
  need to change the treatment of tests dir.  Also, some places in the
  source that need to check config information don't.)

* Fix the treatment of float in HashTable.m

* I will finish libcoll documentation.

Questions
=========

* The -getNextElement:enumState: scheme for enumerating has terrible
  potential for misuse leading to memory leaks.  Currently several
  classes will induce memory leaks if the programmer stops the
  enumeration before all elements have been 'gotten'.  I'm open to
  ideas for alternative implementations.

* Does anyone really miss the ability to set the comparison function
  independent of the element encoding?  If it's really important we
  can come up with ways to do this and still be able to archive
  comparison functions.

* Should we keep the -safe... enumeration methods?  They sure do add a
  lot of clutter to the protocols.  If we got rid of them people would
  have to alloc an Array, copy the contents, and free the Array
  themselves.

* What would people think about removing the ...Object methods, and
  just having the ...Element methods instead?  It might be a bit more
  difficult to use, but it would reduce clutter significantly.  The
  ...Element methods are more efficient to use anyway, since most
  ...Object methods are wrappers around ...Element calls.  I'm not
  sure what I think we should do.

* Perhaps we should put more safety checks into LinkedList,
  BinaryTree, etc:
    Check that node is not already part of another collection (by
    checking that neighbor pointers are nil?)
    In method "insertObject:newObject after:oldObject" make sure that
    oldObject is a member of self.
  ...It seems that there is a lot of potential for nasty bugs with
  mistakes like these.

* What about
    "Function Names:: Printable strings which are the name of the
     current function."
  This is a menu item in gcc.info-6.  Could this make a difference to
  archiving comparison functions?  Where are the details on this?

* HashTable.m (-initKeyDesc:valueDesc:capacity:) I tried to make it
  portable, but I didn't try very hard.  Did I do it right?

* I fixed -emptyCopy in all the subclasses, but the -emptyCopy scheme
  seems pretty fragile.  How about calling -initFoo: inside
  -emptyCopy?  This way we avoid having yet another method in which
  instance vars must be initialized to some consistent state.
  -allocCopy would never even get called.

* The situation with LinkedListNode and LinkedListEltNode (and the
  analogous classes for BinaryTree's and RBTree's) is just crying out
  for multiple inheritance.
  Anyone have ideas that are prettier than my quick fix using
  #include ?

* Some classes, like RBTree, depend on a certain ordering of elements
  to maintain internal consistency.  How should we define the behavior
  of methods whose names imply that the user can set the ordering of
  elements independent of these constraints (like -appendElement: or
  -insertElement:atIndex:, for instance)?  I see three possibilities:
   1. The methods do what they say.  Give the user the power to
      override the default ordering.
   2. The methods do not do what they say, but call -addElement:
      instead.  This lets the class maintain its internal consistency,
      but it has the potential to be a bit confusing to the user.
   3. The methods trigger a -shouldNotImplement: error.  This solution
      perhaps best expresses the reality of the situation, but it's a
      bit strange for a class to signal an error on a method which is
      in a protocol the class conforms to.
  Currently I'm using solution #1 (in most classes?).

* I created libcoll.texi by copying libg++.texi.  Some of the text is
  taken verbatim.  Is this a problem?

* If you don't like the organization of the documentation and you have
  suggestions for changes, please say so now, not after it's all been
  written.

* Something like this needed?
    - elementDidChange: (elt*)elementPtr;
  Currently you have to remove, change, add, for some classes.

Questions leftover from last release
====================================

* Opinions on the error reporting scheme?  See also tests/test5.m.

* NeXT prefixes all their names with "NX" to prevent name conflicts.
  Should GNU be doing something similar?  I'd rather use a cleaner
  solution.  (This has been discussed in email.)

* Should I include _comparison_function as an instance variable of
  Collection?  Putting it back in could make some things more
  efficient, but several classes don't have configurable
  comparisonFunction's (i.e. String, LinkedList, BinaryTree, RBTree),
  so they don't need it.

* In the current implementation of BinaryTree (and RBTree and
  SplayTree) the -addElement: method puts the element in sorted
  order.  HOWEVER, the user has the capability of putting elements in
  unsorted order by sending -insertElement:before:,
  -insertElement:after: or -insertElement:atIndex:.  Should I make
  these three methods call -addElement: instead?  The methods wouldn't
  do what their names implied, but it would ensure sorted order was
  preserved.

* I've been told that GNU filenames should be 14 chars or less.  I
  don't want to abbreviate my classnames, but I think using .h
  @interface files with names different from the class name is even
  worse.
  ** I want to keep my unabbreviated filenames!! **
  What to do, what to do...  I can't believe that *all* GNU classnames
  will be limited to 12 characters forever and ever--disgusting.

Changes I'd like in the Objective-C runtime and gcc:
====================================================

(See collstd.[hm].)

* Make OBJC_MALLOC() and friends public.  Have the runtime and
  Object.m use them.  See collstd.[hm].

* Give hash.[hc] functionality more like collhash.[hc], i.e.:
    Add hash_node_for_key() to hash.h and hash.c.
    Change hash_next() so that more than one enumeration of the
    contents can happen concurrently.
    How about removing cache as a parameter to the hashing functions
    in hash.h and hash.c?  Do the masking on the result of the hash
    function.  This would seem much neater.
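
The hash.[hc] changes requested above could look roughly like the
following.  This is only a sketch in plain C (which also compiles as
Objective-C).  Every sk_ name is hypothetical, and none of it is the
actual interface of hash.h or collhash.h.  It assumes a chained table
whose size is a power of two: hash functions take only the key, the
table masks the result, and enumeration state lives in a struct owned
by the caller so that several enumerations can proceed at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not the runtime's hash.h. */
typedef unsigned int (*sk_hash_func)(const void *key);  /* no cache parameter */
typedef int (*sk_equal_func)(const void *a, const void *b);

typedef struct sk_node {
    struct sk_node *next;
    const void *key;
    void *value;
} sk_node;

typedef struct {
    sk_node **buckets;
    unsigned int mask;       /* size - 1, where size is a power of two */
    sk_hash_func hash;
    sk_equal_func equal;
} sk_table;

/* Caller-owned enumeration state: no allocation, nothing to leak. */
typedef struct {
    unsigned int bucket;     /* next bucket to scan */
    sk_node *node;           /* node returned by the previous call */
} sk_enum_state;

sk_table *sk_new(unsigned int size, sk_hash_func h, sk_equal_func e)
{
    sk_table *t;
    assert(size != 0 && (size & (size - 1)) == 0);  /* power of two */
    t = malloc(sizeof *t);
    if (!t) return NULL;
    t->buckets = calloc(size, sizeof *t->buckets);
    if (!t->buckets) { free(t); return NULL; }
    t->mask = size - 1;
    t->hash = h;
    t->equal = e;
    return t;
}

/* The table, not the hash function, masks the hash result. */
static unsigned int sk_index(const sk_table *t, const void *key)
{
    return t->hash(key) & t->mask;
}

/* Returns 1 on success, 0 if memory is exhausted. */
int sk_add(sk_table *t, const void *key, void *value)
{
    unsigned int i = sk_index(t, key);
    sk_node *n = malloc(sizeof *n);
    if (!n) return 0;
    n->key = key;
    n->value = value;
    n->next = t->buckets[i];
    t->buckets[i] = n;
    return 1;
}

/* Returns the node holding key, or NULL if the key is absent. */
sk_node *sk_node_for_key(const sk_table *t, const void *key)
{
    sk_node *n;
    for (n = t->buckets[sk_index(t, key)]; n; n = n->next)
        if (t->equal(n->key, key)) return n;
    return NULL;
}

sk_enum_state sk_enum_begin(void)
{
    sk_enum_state s = {0, NULL};
    return s;
}

/* Advances s and returns the next node, or NULL when exhausted. */
sk_node *sk_next(const sk_table *t, sk_enum_state *s)
{
    if (s->node)
        s->node = s->node->next;          /* continue the current chain */
    while (!s->node && s->bucket <= t->mask)
        s->node = t->buckets[s->bucket++]; /* move to the next bucket */
    return s->node;
}

/* Example key functions for unsigned int keys. */
unsigned int sk_hash_uint(const void *key)
{
    return *(const unsigned int *)key * 2654435761u;
}

int sk_equal_uint(const void *a, const void *b)
{
    return *(const unsigned int *)a == *(const unsigned int *)b;
}
```

Because sk_enum_state is a plain struct on the caller's stack, two
loops can walk the same table at the same time, and abandoning an
enumeration early leaves nothing to free.  The sketch does not handle
insertions or removals made while an enumeration is in progress.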