• 0

c# Short URL from GUID


Question

Hello gang,

 

I am working on a new project for a site and I would like to implement short urls.  Historically I have used GUIDs as table ids so that replication is not an issue.  So, now I'm looking at creating a short url for these values, the thing is a shortened guid is not that short (vs a shortened Int) Before getting too far down a path, I thought I'd ask if anyone had any thoughts.

 

Thanks

Link to comment
https://www.neowin.net/forum/topic/1203663-c-short-url-from-guid/
Share on other sites

16 answers to this question

Recommended Posts

  • 0

Just generate strings of whatever size you consider to be short, made up of numbers, uppercase letters and lowercase letters, and associated them with your ID field.

 

[short_url_assoc_table]

table_id : GUID

short_url : string

  • 0
  On 06/03/2014 at 20:00, firey said:

What is it you are trying to do?

Like you have something say:

guasd3 = www.google.ca

-3idjqis = www.neowin.net

or?  

I guess I don't understand what you are trying to do exactly.

 

 

I need to use GUIDs as the table ident fields for the issue of replication.  If I use numeric values for the identity, which is easy to convert to a short URL by using Base64 (quite a number of examples around the net for doing this)  however if I use numeric I'm going to have collisions when multiple machines are making new records.  When I have looked at making a short url from a guid, the value is shorter than the GUID, but still longer than the average short GUID.

 

  On 06/03/2014 at 20:32, virtorio said:

Just generate strings of whatever size you consider to be short, made up of numbers, uppercase letters and lowercase letters, and associated them with your ID field.

 

[short_url_assoc_table]

table_id : GUID

short_url : string

 

Interesting idea, but this could be an issue with replication.  I am also concerned about multiple identity fields (waste of time, space, etc)   Thanks though

  • 0
  On 06/03/2014 at 21:16, James Rose said:

 

 

Interesting idea, but this could be an issue with replication.  I am also concerned about multiple identity fields (waste of time, space, etc)   Thanks though

 

What exactly is being replicated?

 

 

  On 06/03/2014 at 21:16, James Rose said:

I need to use GUIDs as the table ident fields for the issue of replication.  If I use numeric values for the identity, which is easy to convert to a short URL by using Base64 (quite a number of examples around the net for doing this)  however if I use numeric I'm going to have collisions when multiple machines are making new records.  When I have looked at making a short url from a guid, the value is shorter than the GUID, but still longer than the average short GUID.

What am I missing here? This shouldn't be an issue with a relational database. You don't tell your database what the ID of a row is, you let the database decide when it inserts the row(s).

  • 0

Im not sure where replication is happening.  Also why do you have to use guids for the identity, why not just use an incrementing number, and hash it's value or something to get the url you want to use?  Also, the database should be able to handle the ID itself.. and there should never be an issue of too many inserts causing problems.

  • Like 1
  • 0

Okay gang,

 

Replication of tables between multiple servers cannot use numeric values as, for example in SQL Server the Identity value is incrimented by 1 (and yes, you can change this value, but it wouldn't help when the app needs to scale)  Imagine two servers, each one adding new values to a table; "Customers" One each server they would both get identity #1 for the first record, when the two servers attempt to merge (ever hour, every minute, whenever) there would be a collision since there would be two records with the same identity value.  Using GUIDs for the key field avoids this issue as it is, almost, impossible to have the same guid twice.

 

 

Thanks Asik, however the "guidAsString" variable is still too long to be a short url.

  • 0

Looks like I found the answer.

                //Guid guid = Guid.NewGuid();
                string sGUID = "a33d4a21-7d95-41f7-859e-bf02b2fda650"
                string hashCode = String.Format("{0:X}", sGUID.GetHashCode());
                Console.WriteLine(hashCode);

This makes a nice small url, can anyone think of why this should not be used?

  • 0

^you shouldn't use it for the reasons listed here (about uniqueness guarantees and differences between versions): http://stackoverflow.com/questions/7458139/net-is-type-gethashcode-guaranteed-to-be-unique

 

Hash the GUID using SHA1 and truncate it or something like that. That's probably the best you are going to do. (perhaps you will have to truncate it to much, forcing a too high of probability for collision -- you should check the probability).

 

EDIT: Oh also, if you encode the result of hashing in a higher base, you reduce the amount of information loss during truncation.

  • 0
  On 06/03/2014 at 22:13, snaphat (Myles Landwehr) said:

^you shouldn't use it for the reasons listed here (about uniqueness guarantees and differences between versions): http://stackoverflow.com/questions/7458139/net-is-type-gethashcode-guaranteed-to-be-unique

 

Hash the GUID using SHA1 and truncate it or something like that. That's probably the best you are going to do. (perhaps you will have to truncate it to much, forcing a too high of probability for collision -- you should check the probability).

 

EDIT: Oh also, if you encode the result of hashing in a higher base, you reduce the amount of information loss during truncation.

 

EDIT:  Maybe what the article is saying, and what you are trying to tell me is that two different GUIDs could return the same hex?

 

 

 

Pardon me if I appear dense; I just read that article and ran some test against the same guid value for 1 billion iterations and it always comes up with the same hex.  I understand that 1 billion isn't necessarily that large a number...  what I am asking is shouldn't the hex value for a specific string always return the same hex value. 

 

Quote: "does not guarantee unique return values for different objects."  Since the app will pull the guid from the db, and then issue a hex on demand wouldn't that value always be the same?

 

Thanks for your input

  • 0
  On 06/03/2014 at 22:25, James Rose said:

Pardon me if I appear dense; I just read that article and ran some test against the same guid value for 1 billion iterations and it always comes up with the same hex.  I understand that 1 billion isn't necessarily that large a number...  what I am asking is shouldn't the hex value for a specific string always return the same hex value. 

 

Quote: "does not guarantee unique return values for different objects."  Since the app will pull the guid from the db, and then issue a hex on demand wouldn't that value always be the same?

 

Thanks for your input

 

It's the same each time because you are using the same version of the .net runtime on the same object for each run so it's producing the same hash. What they are saying is really two things: (1) if you switch versions of the .net runtime (e.g. 3.5 to 4), the returned result can be different for the same object, and (2) and within the same version of the runtime (e.g. 4) there can be collisions in hashes between different objects. There are no uniqueness guarantees.

 

So for example GUID_A.getHashCode() can return different results if you switch .net runtimes. And GUID_B.getHashCode() and GUID_C.getHashCode() could return the same result in the same runtime.

  • 0
  On 06/03/2014 at 22:31, snaphat (Myles Landwehr) said:

So for example GUID_A.getHashCode() can return different results if you switch .net runtimes. And GUID_B.getHashCode() and GUID_C.getHashCode() could return the same result in the same runtime.

 

yea, this is the answer I finally got to (see my edit above).  I was having a very hard time getting to the idea that a 30+ char piece of data could reliably be set to a shorter value.

  • 0
  On 06/03/2014 at 22:39, James Rose said:

yea, this is the answer I finally got to (see my edit above).  I was having a very hard time getting to the idea that a 30+ char piece of data could reliably be set to a shorter value.

Well in any case, you should re-encode whatever you do use to a higher number base that is still valid as url characters. For example, as I was saying before if you do the following you can store more information of your hash in less characters. 

String result=re_encode_as_base_X(SHA1_hash(GUID), N) //base 16 --> base N

I think at the end of the day, you will have to truncate though regardless of what you do. 

  • 0

It turns out I may be suffering from "doing this too long" desease.  Someone was kind enough to send a private message to me that the issue of replication on numeric  idenities may no longer be the issue it used to be.

 

I'm reading this article now:  http://technet.microsoft.com/en-us/library/ms146907%28v=sql.105%29.aspx

  • Like 1
  • 0
  On 06/03/2014 at 22:01, James Rose said:

Thanks Asik, however the "guidAsString" variable is still too long to be a short url.

I was suggesting taking the BigInteger and passing it through whatever method you mentionned that converted numerical values into short URLs, not taking it as a string directly. Anyway, looks like you found your answer.

This topic is now closed to further replies.
  • Posts

    • glad we have more options but im sticking with playnite i know whatever implementation MS does will be half baked
    • ...teasing that it may be becoming a universal launcher like GOG Galaxy or Playnite. Can I respectfully ask that they go and do one? I don't need another launcher.
    • same. i bought my sister HP 440 laptop with super bright display for a half price of mac book. installed LTSC and set EU region so no preinstalled shlt apps
    • Lenovo announces the most powerful ARM-based Chromebook with an OLED display by Pradeep Viswanathan Lenovo today announced the Lenovo Chromebook Plus 14, its most powerful ARM-based Chromebook. This Chromebook is powered by the MediaTek Kompanio Ultra 910 processor, which features an NPU that can deliver up to 50 TOPS of AI performance. The Chromebook Plus 14 comes with a 14-inch OLED display, with optional touchscreen models. Customers can customize this laptop with up to 16 GB of RAM based on their performance needs. Thanks to the power-efficient SoC, Lenovo claims that this Chromebook can last up to 17 hours on a single charge, the longest battery life on a Chromebook Plus. The Lenovo Chromebook Plus 14 is now available for purchase in the US, starting at $649 from Best Buy and Lenovo’s official website. To make the purchase more valuable, Google is offering a one-year subscription to its Google AI Pro plan (a $240 value) with every Chromebook Plus. To take advantage of the powerful on-device AI capabilities of the Lenovo Chromebook Plus 14, Google is releasing the following two exclusive AI features: Smart grouping: Users can use AI to organize open Chrome tabs and documents into logical groups. Image editing in the Gallery app: The Gallery app can be used to remove backgrounds, make stickers, and more. Apart from the above exclusive features, Google is also releasing the following updates to all Chromebook Plus models starting today: Select to search & Text capture: A Google Lens-like capability is now available on Chromebooks. Users can just long-press the on-screen launcher button or use the screenshot tool to select anything on their screen for instant Google Search results. Users can also use the new "Text capture" to automatically extract text from images and send it to Google Workspace apps or calendars as editable text. The Quick Insert (QI) key, which was introduced earlier this year, now allows users to easily generate images using AI in addition to its existing capabilities. The new "simplify" feature within "Help me read" will help students convert complex language into more understandable content. Google’s popular NotebookLM research and note-taking app is now pre-installed on every Chromebook Plus. Netflix’s popular Squid Game: Unleashed game is coming to Chromebooks as an optimized desktop app with keyboard and mouse controls and some exclusive in-game items, including skins. With its high‑performance, premium hardware and advanced AI features, the new Lenovo Chromebook Plus 14 is trying to position itself as a strong contender against Windows laptops in the premium segment.
  • Recent Achievements

    • Week One Done
      fredss earned a badge
      Week One Done
    • Dedicated
      fabioc earned a badge
      Dedicated
    • One Month Later
      GoForma earned a badge
      One Month Later
    • Week One Done
      GoForma earned a badge
      Week One Done
    • Week One Done
      ravenmanNE earned a badge
      Week One Done
  • Popular Contributors

    1. 1
      +primortal
      651
    2. 2
      Michael Scrip
      226
    3. 3
      ATLien_0
      219
    4. 4
      +FloatingFatMan
      143
    5. 5
      Xenon
      137
  • Tell a friend

    Love Neowin? Tell a friend!