View Full Version : Lexicographic order?

07-02-2002, 11:20 PM
Does anyone know how to sort an array of vertices in lexicographic order?



PS: I have already tried www.google.com. :)

07-03-2002, 01:42 AM
You're joking, right? Lexicographical order means sorted alphabetically: Adrian, John, Kenneth, Lisa, etc. Why you would want to sort an array of vertices (points in 3-space) lexicographically is beyond me. How you could determine the lexicographical order of anything but words and letters is beyond me. But maybe you mean something different? Or if you really want to do this, map the bytes of the vectors to chars and treat them as words. You'd get into trouble with non-alphanumeric characters though, so it probably wouldn't work either.

07-03-2002, 01:49 AM
I think that's what he seems to mean. Convert 3.14 to "three point one four" and then do an ordinary quicksort ;)

07-03-2002, 03:17 AM
Try this here:

07-03-2002, 06:30 AM
I just came to the conclusion that my link was not as useful as it should have been - sorry about that.

If there are two vectors A and B, each with float (or double) components (x, y, z), then the lexicographic order is:

A < B iff

A.x < B.x Or
A.x == B.x And A.y < B.y Or
A.x == B.x And A.y == B.y And A.z < B.z
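In C++ this predicate might look like the following sketch (Vec3 is an assumed struct name, not from the thread):

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Lexicographic "less than": compare x first, then y, then z.
bool lexLess(const Vec3& a, const Vec3& b) {
    if (a.x != b.x) return a.x < b.x;
    if (a.y != b.y) return a.y < b.y;
    return a.z < b.z;
}
```

Such a predicate can be passed directly to std::sort to order a vertex array.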

07-03-2002, 01:11 PM
That's what the link said, except for the symbols you used; the link wasn't edited clearly.

Why would someone want to sort this way?


07-03-2002, 02:03 PM
Originally posted by V-man:
Why would someone want to sort this way?

Don't know about this particular way, but some sort of ordering predicate is necessary for building efficient searchable collections, which might come in handy. Think about processing a big bunch of coordinate values looking for duplicates, so that you can convert the data to a more economical indexed form.
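As a sketch of that duplicate-hunting idea (names are mine; "duplicate" here means exact float equality):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Lexicographic ordering predicate; equal components fall through to the next.
bool lexLess(const Vec3& a, const Vec3& b) {
    if (a.x != b.x) return a.x < b.x;
    if (a.y != b.y) return a.y < b.y;
    return a.z < b.z;
}

bool sameVec(const Vec3& a, const Vec3& b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

// Count distinct vertices: sort lexicographically so duplicates become
// adjacent, then collapse them -- the first step toward an indexed mesh.
std::size_t countDistinct(std::vector<Vec3> v) {
    std::sort(v.begin(), v.end(), lexLess);
    return std::unique(v.begin(), v.end(), sameVec) - v.begin();
}
```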

07-04-2002, 05:38 AM
Exactly. I think the objective is to sort vertices so you can search more effectively for duplicate entities, connectivity, curvature, etc.

07-04-2002, 11:58 AM

That's a good idea actually.


Kilam Malik
07-05-2002, 12:05 AM
I use hashing to determine if a point matches another point in the hash table. See my answer in:


Instead of making one hash value for the whole point, you could hash each component (x, y and z) one after another: a hash table for x, inside that a hash table for y, and inside that a hash table for z.


[This message has been edited by Kilam Malik (edited 07-05-2002).]

07-05-2002, 11:10 PM
Why sort lexicographically and not just by value?

07-06-2002, 02:14 AM
Hashing is probably a good idea if you want to accelerate, e.g., lookup of neighbouring vertices. The average complexity of a hash lookup is O(1), although the worst case is O(n). Storing things sorted, say in a tree, gives better worst-case complexity but worse average lookup time. Provided you have a good enough hash function, the hash lookup should perform better. I don't know of a good hash function for 3-component float vectors, though...

Kilam Malik
07-07-2002, 09:33 PM
Basically I use

x + 1000 * y + 1000000 * z MODULO buckets;

Buckets is a prime number.

The important thing is that if a stored value is 1.00000 and the one you search for is 0.99999, you won't find it. To avoid this, I search the bucket my hash function gives me and also one bucket below and one above. This means O(3).
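A sketch of this scheme (function names are mine; the multipliers and the three-bucket search follow the description above):

```cpp
#include <cassert>
#include <cmath>

// Bucket index from the weighted component sum, made positive with fabs
// and reduced modulo the (prime) bucket count.
int bucketOf(float x, float y, float z, int buckets) {
    float h = std::fabs(x + 1000.0f * y + 1000000.0f * z);
    return (int)((long long)h % buckets);
}

// The three buckets to search: the hashed bucket plus its two neighbours,
// so values just either side of a bucket boundary are still found.
void candidateBuckets(int b, int buckets, int out[3]) {
    out[0] = (b + buckets - 1) % buckets;
    out[1] = b;
    out[2] = (b + 1) % buckets;
}
```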

Another possibility I haven't tried yet is to cascade the hashes: you hash every component of the vector separately, and each lookup leads you to the next hash table. So you hash x, then y, then z. You also have O(3) then.
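A sketch of that cascade using nested standard hash maps (illustrative only; the post gives no code):

```cpp
#include <cassert>
#include <unordered_map>

// x -> y -> z -> vertex index, one hash lookup per component.
using ZMap = std::unordered_map<float, int>;
using YMap = std::unordered_map<float, ZMap>;
using XMap = std::unordered_map<float, YMap>;

void insertPoint(XMap& m, float x, float y, float z, int index) {
    m[x][y][z] = index;
}

// Returns the stored index, or -1 if the exact point is absent.
int findPoint(const XMap& m, float x, float y, float z) {
    auto ix = m.find(x);
    if (ix == m.end()) return -1;
    auto iy = ix->second.find(y);
    if (iy == ix->second.end()) return -1;
    auto iz = iy->second.find(z);
    if (iz == iy->second.end()) return -1;
    return iz->second;
}

// Tiny self-check helper: inserts one point, then looks a query up.
int demo(float qx, float qy, float qz) {
    XMap m;
    insertPoint(m, 1.0f, 2.0f, 3.0f, 7);
    return findPoint(m, qx, qy, qz);
}
```

Note that exact-match hashing like this still misses nearly-equal coordinates, which is what the neighbouring-bucket search is for.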


[This message has been edited by Kilam Malik (edited 07-07-2002).]

07-07-2002, 11:13 PM

What is a half-edge datastructure?

Kilam Malik
07-08-2002, 12:39 AM

It is a data structure that stores the geometry _and_ the topology of a 3D body.

Basically, you have a vertex list (no duplicates; this is where I use the hashing). Next is the edge list. Each edge refers to its two vertices. Each edge also refers to two half-edges, one on the left and one on the right side of the edge. Each half-edge refers to its loop and to its face.

This way, you have all the relationships between faces, loops, edges and vertices. For example, you can easily find all faces that touch a vertex. Mostly, I need it to determine the left and right face of an edge.

Also, this way the topology is kept separate from the geometry. The only place where real coordinates are used is in the vertices; every other element only refers to the vertices. This means that if you move a vertex, all faces change properly.

Another point is that an edge need not be linear. It can be of any shape.

I use it, e.g., to load STL files and automatically smooth them by determining the angle at which two faces meet at an edge.

Just search for "half edge" on google, it should give you a lot of results.
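A minimal index-based sketch of such a structure (field names are illustrative; real half-edge layouts vary):

```cpp
#include <cassert>
#include <vector>

// Each half-edge knows the vertex it points to, its twin on the other
// side of the edge, the next half-edge in its face loop, and its face.
struct HalfEdge {
    int vertex;
    int twin;   // -1 on a boundary
    int next;
    int face;
};

// Walk one face loop by following 'next' links, collecting vertex indices.
std::vector<int> faceLoop(const std::vector<HalfEdge>& he, int start) {
    std::vector<int> verts;
    int e = start;
    do {
        verts.push_back(he[e].vertex);
        e = he[e].next;
    } while (e != start);
    return verts;
}
```

Queries like "all faces touching a vertex" come from similar short walks over twin/next links.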



07-10-2002, 02:41 AM

How does the hash table work?

Do you sort vertices by hVal?

[This message has been edited by billy (edited 07-10-2002).]

Kilam Malik
07-10-2002, 03:51 AM
You have a number of buckets, let's say 17. In each bucket you have a list of vertices. A hash function tells you the hVal of a vertex, a number between 0 and 16; that's the list you append the vertex to.

With a well-chosen hash function you get an even distribution over the buckets. This means you need O(2) to find a vertex if there are 34 vertices in your hash table.
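The even-spread claim can be illustrated with a toy modulo "hash" (uniform keys, so the spread is exact here):

```cpp
#include <cassert>
#include <vector>

// Distribute numValues sequential keys over numBuckets buckets and
// return the chain length of each bucket.
std::vector<int> bucketCounts(int numValues, int numBuckets) {
    std::vector<int> counts(numBuckets, 0);
    for (int key = 0; key < numValues; ++key)
        counts[key % numBuckets]++;
    return counts;
}
```

With 34 values and 17 buckets every chain has length 2, so a lookup scans at most two entries.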

The number of buckets should be a prime number, high enough for your application. I have used 16381.

For more information, just search the web. Hashing is a standard technique which can index anything you can compute a hash value for.


[This message has been edited by Kilam Malik (edited 07-10-2002).]

07-10-2002, 05:06 AM
This is just nit-picking, but O(2) and O(3) are the same thing as O(1), since they only differ by a constant. O(1) just means "constant time"; that time, however, might be 4000 years. The only point is that, given a sufficiently large n, an algorithm with asymptotic complexity O(1) will be faster than one with complexity O(n). There's always this little invisible constant you have to remember ;-)

07-10-2002, 05:58 AM
Actually, O(x) means it takes x times one step. For the hash table example, with a good hash-key generator, finding an object among twice the amount of stored objects still takes more or less 2 steps, so O(2) is good.

Remember, there is a well-known sorting algo... it has O(4*k)...

Kilam Malik
07-10-2002, 08:16 AM
Ok, it was not the correct notation for O(). Best case is O(1), worst case is O(n).

I forgot to mention one thing in this context (which I mentioned in a post above):

You have to search 3 of the buckets: the bucket the hash value points to, plus the bucket below and the bucket above. The reason is that a stored vertex could be 1, 1, 1 while the one you want to find is 0.99999, 0.99999, 0.99999. Those land in two different buckets, but luckily in neighbouring ones.

If you use my hash function from above, the points must match to the 6th digit after the decimal point.


[This message has been edited by Kilam Malik (edited 07-10-2002).]

07-10-2002, 10:58 PM
Also, (-1,-1,-1) will have the same hash value as (1,1,1), right?

ABS(-1 + 1000 * (-1) + 1000000 * (-1)) = 1001001

ABS(1 + 1000 * (1) + 1000000 * (1)) = 1001001

Kilam Malik
07-10-2002, 11:12 PM
Yes, those will be mapped to the same bucket. This could be a problem for symmetric parts, but only if all values are negated. E.g.

1 1 1 and -1 1 1 lie in different buckets.


07-10-2002, 11:19 PM
Why don't you translate all coordinates into positive xyz space?

Also, shouldn't the number of buckets vary with the number of vertices, and the hash function with the dimensions of the mesh?

What is your opinion on this, kilam?

Kilam Malik
07-11-2002, 12:23 PM
Originally posted by billy:
Why don't you translate all coordinates into positive xyz space?

Also, shouldn't the number of buckets vary with the number of vertices, and the hash function with the dimensions of the mesh?

What is your opinion on this, kilam?

1) With abs() all coordinates are in positive space. You have to do this, otherwise you would get a negative index for the buckets.

2) You could optimize the function by doing that. But as I'm reading STL files with this, I would have to read the file twice: first to count the triangles, then initialize the hash buckets with an appropriate number, and then read the vertices. For the models I use, the fixed number is ok.

3) The same goes for the hash function: I would have to read and analyze the file before I could build a good hash function for the data. As the highest multiplier is 1 million, I can reach all buckets if the model is at least 16,381 / 1,000,000 big, which is about 0.016. This is enough for my models.

If you have models that vary strongly in number of points and size, it is of course a good idea. But both have the problem of reading the file twice, or reading it into a second data structure, analyzing it, and then transferring the points to the hash table.

Kilam Malik.

[This message has been edited by Kilam Malik (edited 07-11-2002).]