• 0

2 Questions about Hash Strings


Question

When I generate a hash (say SHA256 or SHA512), the hash string is composed of a combination of a-f and 0-9.

  1. Is there a way to generate a hash string that is composed of a-z, A-Z and 0-9?
  2. Is there a way to control what characters are used, so that if I only wanted m-z, A-L, 0-9 and "-_=*^#@!()[]{}<>;:,.?" that would be a possibility?
https://www.neowin.net/forum/topic/1369666-2-questions-about-hash-strings/

11 answers to this question


  • 0

You could base36-encode the hash output to get a string composed of a-z and 0-9 (or write a simple custom encoder that maps to whatever set of characters you want), but I can't think of a reason why you would want to do this.
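
A minimal sketch of the base36 idea, written in Python rather than the PHP discussed in this thread. It treats the raw digest as one big integer and writes that integer out in base 36. The name `encode_digest` and the sample input are made up for illustration, and leading zero bytes of the digest are not preserved by this simple scheme.

```python
import hashlib

BASE36 = "0123456789abcdefghijklmnopqrstuvwxyz"

def encode_digest(digest: bytes, alphabet: str = BASE36) -> str:
    """Write the digest, read as a big-endian integer, in the given alphabet."""
    base = len(alphabet)
    number = int.from_bytes(digest, "big")
    if number == 0:
        return alphabet[0]
    chars = []
    while number:
        number, remainder = divmod(number, base)
        chars.append(alphabet[remainder])
    # Digits come out least significant first, so reverse them.
    return "".join(reversed(chars))

digest = hashlib.sha256(b"seahorsepip").digest()  # 32 raw bytes, not hex text
print(encode_digest(digest))                      # at most 50 characters of 0-9a-z
```

Passing a different `alphabet` string changes both the base and the characters used, which is the "map to whatever set of characters you want" part.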

  • 0

Understanding why it's needed is my business. But thank you.

 

Secondly, PHP does this when creating a session. You are able to set its sid_bits_per_character to 6, which uses a-z, A-Z and 0-9; thus I assumed there is a method to specify which characters may appear in the hash.

  • 0

You assumed wrong. What PHP is doing is completely independent of the hashing method. It simply takes the bits returned from (any) hashing method and, rather than displaying them in hexadecimal, encodes them into a string using a character set of its choosing, just as I said you could do:

 


static char hexconvtab[] = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,-";

static void bin_to_readable(unsigned char *in, size_t inlen, char *out, size_t outlen, char nbits) /* {{{ */
{
	unsigned char *p, *q;
	unsigned short w;
	int mask;
	int have;

	p = (unsigned char *)in;
	q = (unsigned char *)in + inlen;

	w = 0;
	have = 0;
	mask = (1 << nbits) - 1;

	while (outlen--) {
		if (have < nbits) {
			if (p < q) {
				w |= *p++ << have;
				have += 8;
			} else {
				/* Should never happen. Input must be large enough. */
				ZEND_ASSERT(0);
				break;
			}
		}

		/* consume nbits */
		*out++ = hexconvtab[w & mask];
		w >>= nbits;
		have -= nbits;
	}

	*out = '\0';
}

https://github.com/php/php-src/blob/master/ext/session/session.c#L269
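
For readers who want to experiment with the bit-packing loop above, here is a rough Python port. It is a sketch that mirrors the C logic (same table, same low-bits-first order) and is not taken from php-src; raising `ValueError` on short input is this sketch's own choice in place of `ZEND_ASSERT`.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ,-"

def bin_to_readable(data: bytes, out_len: int, nbits: int) -> str:
    """Emit out_len characters, consuming nbits of input per character."""
    mask = (1 << nbits) - 1
    window = 0  # bit buffer, like `w` in the C code
    have = 0    # number of unread bits currently in the buffer
    pos = 0     # index of the next unread input byte
    out = []
    for _ in range(out_len):
        if have < nbits:
            if pos >= len(data):
                raise ValueError("input too short for requested output length")
            window |= data[pos] << have
            pos += 1
            have += 8
        out.append(ALPHABET[window & mask])  # low nbits select one character
        window >>= nbits
        have -= nbits
    return "".join(out)

raw = bytes(range(16))
print(bin_to_readable(raw, 32, 4))  # 4 bits per character: only 0-9a-f appear
print(bin_to_readable(raw, 21, 6))  # 6 bits per character: full 64-entry table
```

With 4 bits only the first 16 table entries can be selected, with 5 bits the first 32, and with 6 bits all 64, which is why the character set widens as the setting goes up.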

 

 

I asked why because I hope you're not using this for security purposes. Based on the fact that you had to ask this question in the first place, you're more likely to end up reducing security rather than increasing it.

Edited by ZakO
  • 0

1. No, you can only have one of A-Z or a-z.  This can be done by encoding the string that your hashing algorithm outputs to something that is hex.

2. No, this defeats the point of a hash.  I can't think of a good reason for doing this.

  • 0
  On 05/08/2018 at 11:20, Fahim S. said: "1. No you can only have one of A-Z or a-z. [...] 2. No, this defeats the point of a hash. I can't think of a good reason for doing this."

Thanks.

It's funny when I make inquiries, rather than answering I am offered personal opinions of understanding.  

 

While you offered some answers, you added your ego (or lack of worldly experience) into the mix.  You do not need to know why I want something.  The comment of "I can't think of a good reason for doing this" is naive and immature.  Of course you cannot think of a good reason to do this; it's because you haven't lived my life; surely you understand that. But mostly that comment is completely irrelevant.

In future, just answer the question and don't interject your immaturity into your response.

 

Cheers mate.

Edited by Brian Miller
  • 0
  On 05/08/2018 at 10:51, ZakO said: "What PHP is doing is completely independent of the hashing method [...] I asked why because I hope you're not using this for security purposes [...]"

 

Thanks dude, that's what I thought too.  I like your idea of base encoding it; I may use Base56.

The reason I asked is that I wanted to learn about forming such strings, not necessarily for any foolish attempt at security.  The formation of Bitcoin addresses such as "1BoNtSLRHtKNngkdx3e0bR7gb53L3TtpYt" first piqued my curiosity; then discovering that PHP sessions can also include a-zA-Z and 0-9 when sid_bits_per_character is set to 6 prompted me to enquire with learned people here.

 

  • 0
  On 06/08/2018 at 03:26, Brian Miller said: "[...] In future, just answer the question and don't interject your immaturity in to your response."

Err... ok. 

 

As we are offering tips to one another, let me give you one: a bit of context can go a long way in getting an answer.  Developers like solving problems, and without detail of the underlying problem it is difficult to help.

The comment was intended to probe for context (I suggest you read about the 5 whys) so that I could try my best (within the bounds of my knowledge) to help you come to an answer quicker.  I apologise for the negative impression that you drew from it.

  • 0
  On 06/08/2018 at 03:26, Brian Miller said: "While you offered some answers, you added your ego (or lack of worldly experience) in to the mix. You do not need to know why I want something. [...]"

 

Mate, I see your post count and reputation, but this doesn't mean you should act like an a**hole. You are not paying those people, so you shouldn't have such expectations of their answers. From my point of view those were relevant and polite answers from people doing their best to help you.

...and yes, adding my ego is perfectly fine on a public forum.

 

Have a nice day!

  • 0

Hello,

 

SHA-256 and SHA-512 results are conventionally displayed in hexadecimal notation, which is why you see 0-9 and a-f used in the results: those are the sixteen digits that make up hexadecimal notation.

Instead of rewriting the hashing algorithms to provide your own numbering system, perhaps it would be better to use something like SSDeep, which supports a larger encoding set?

 

Regards,

 

Aryeh Goretsky

 

  • 0
  On 06/08/2018 at 03:26, Brian Miller said: "You do not need to know why I want something."

You clearly do not understand how forums work... That you got the answers you got is way more than I would ever, in a million years, have given you after such a comment when asked why.

  • 0

Just encode the hash; the shortest practical encoding I can find is base85.

Encoding: input -> SHA256/512 -> base85

Decoding: base85 -> SHA256/512 hash -> look up the input data by its hash (the hash itself cannot be reversed)
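
The two pipelines above can be sketched in Python roughly as follows. This is an illustration under assumptions: the `store` dictionary is a hypothetical stand-in for wherever the inputs are kept, `short_id` is a made-up helper name, and Python's `base64.b85encode` uses its own base85 alphabet, so its output will not match strings produced by other base85 variants.

```python
import base64
import hashlib

def short_id(data: bytes) -> str:
    """SHA-256 the data and return the raw digest as a base85 string."""
    return base64.b85encode(hashlib.sha256(data).digest()).decode("ascii")

# Hypothetical store: remember which input produced which digest.
store = {}
for item in (b"seahorsepip", b"neowin"):
    store[hashlib.sha256(item).digest()] = item

token = short_id(b"seahorsepip")
digest = base64.b85decode(token)  # base85 -> raw digest bytes
print(token, len(token))          # 40 characters for a 32-byte digest
print(store[digest])              # the original input, found by its digest
```

Decoding only gets you back to the digest; recovering the input works here solely because the digest was recorded alongside the input beforehand.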

 

There are multiple common base encodings: base2(1), base10(2), base16(3), base32, base36, base58, base64(4), base85, base91(5), base128(6)

Each of these encodings has a default character set of N characters (N being the base), but it's possible to replace those with your own character set.

  1. Base encoding of binary data (0 & 1)
  2. Base encoding of a decimal number (0-9)
  3. Base encoding of a hexadecimal string like a SHA512 hash (A-F0-9)
  4. Base encoding commonly used for encoding binary data to a string to embed it in websites
  5. Base encoding with most printable characters
  6. Base encoding of a 7-bit ASCII string (raw bytes would be base256)

 

BUT

If you actually try to encode your hash string, you will find that it doesn't become shorter:

 

Original:

seahorsepip

SHA512: 

F9AA2F6D639C026E3325F31247E8253987D6EC6EEC7E93764F9F3CC25D08FABA7DF95FAF94779CACF22D72F96EEE88D46C90A8CE727944218A1DC272EDA29084

base85: 

mMA+2gdBe&hzWuXfFUKPgCZ^zmL=+Og=E&.gbQ<Fi5:k-mme$4mmf15iwSPLg!6F{gEBI+hafbTmNovbh:*a}hax(3iw-VNiyu81mLV.!h.)&.hBQ56i5<q.hBxFUk@.k2h.)rLg=c%xi6/n!lOZOQmmoA%iwrAH

 

Why doesn't it become shorter?

When you create a hash from data, the hash function returns a hash string in hexadecimal format, also known as base16.

So when you encode that hash string using base85, the base85 encoder treats the input as ASCII text: every hex character is taken as a full byte even though it carries only 4 bits of information. You are encoding 128 bytes of text instead of the 64 bytes of the actual hash, which results in a longer string instead of a shorter one!

How to fix this?

Make sure the base85 encoder receives the raw hash bytes rather than their base16 text.

To do that, convert the hex string to raw bytes (base256), with hex2bin in PHP for example, and use those bytes as input for the base85 encoder.

This means the encoding would be: input -> SHA256/512 (base16) -> bytes (base256) -> base85
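
The length difference is easy to check. The following Python sketch uses `bytes.fromhex` as a stand-in for PHP's hex2bin and the standard library's `base64.b85encode`, whose alphabet differs from the one used for the strings in this post, so only the lengths are comparable.

```python
import base64
import hashlib

hex_digest = hashlib.sha512(b"seahorsepip").hexdigest()   # 128 hex characters
raw_digest = bytes.fromhex(hex_digest)                    # 64 raw bytes, like hex2bin

from_text = base64.b85encode(hex_digest.encode("ascii"))  # encodes the hex *text*
from_bytes = base64.b85encode(raw_digest)                 # encodes the digest itself

print(len(hex_digest), len(from_text), len(from_bytes))   # 128 160 80
```

Base85 turns every 4 input bytes into 5 characters, so 128 bytes of hex text grow to 160 characters while the 64 underlying bytes need only 80.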

 

Original:

seahorsepip

SHA512: 

F9AA2F6D639C026E3325F31247E8253987D6EC6EEC7E93764F9F3CC25D08FABA7DF95FAF94779CACF22D72F96EEE88D46C90A8CE727944218A1DC272EDA29084

base85: 

}kM8+w1iQrgBqS9n9zlVHT$*f)0+dwpOf+^t)QqrEFElNLY@U0[?1[TzTJtgy(>QvA^p4@IxfMO)v]X}

And even shorter base91 (only 1 char shorter in this example):

?e#a_*!Og$d0Rh"Y4Qx.=8}^zpmb~^B4aGWI;`W=?}5&b`B3w0Exl`S[GYF#fG9.1,vcLH]LR%LxhzQ

 

And the encoded string is now shorter :D

 

PHP libraries to do this:

hex2bin: http://php.net/hex2bin

base85: https://github.com/tuupola/base85

 

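So the PHP code to create a shorter hash (hash_file needs the algorithm name as its first argument; $file is assumed to be a file path):

$shorterHash = $base85->encode(hex2bin(hash_file('sha512', $file)));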

 

Update:

It seems you wanted to create a custom base56 encoding. To do that, we could manually write functions to encode and decode it:

$base56_digits = '0123456789ABCDEFGHIJKLMNOPQRSTVWXYZabcdefghijklmnopqrstv';
$custom_digits = 'mnopqrstvwxyzABCDEFGHIJKL0123456789-_=*^#@!()[]{}<>;:,.?';

function encode($base16) {
    global $base56_digits, $custom_digits;

    $base56 = base_convert($base16, 16, 56);
    $custom = strtr($base56, $base56_digits, $custom_digits);

    return $custom;
}

function decode($custom) {
    global $base56_digits, $custom_digits;

    $base56 = strtr($custom, $custom_digits, $base56_digits);
    $base16 = base_convert($base56, 56, 16);

    return $base16;
}

But the above doesn't work, since PHP's base_convert is limited to base 36 (and may lose precision on numbers as large as a hash) :(

Instead you can use a magnificent 3rd party library: https://github.com/ArtBIT/base_convert

 

And then you have:

$custom_digits = 'mnopqrstvwxyzABCDEFGHIJKL0123456789-_=*^#@!()[]{}<>;:,.?';

function encode($base16) {
    global $custom_digits;

    return math\base_convert($base16, 16, $custom_digits);
}

function decode($custom) {
    global $custom_digits;

    return math\base_convert($custom, $custom_digits, 16);
}
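
For comparison, the same custom-alphabet conversion can be sketched in plain Python, whose integers have arbitrary precision, so no extra library is needed. This is an independent illustration and not a port of the library above, so its output is not guaranteed to match. The alphabet is copied from this post and has 56 characters as written (it skips "u"); the `hex_len` parameter is an addition of this sketch to restore leading zeros.

```python
import hashlib

CUSTOM_DIGITS = "mnopqrstvwxyzABCDEFGHIJKL0123456789-_=*^#@!()[]{}<>;:,.?"

def encode(hex_string: str, digits: str = CUSTOM_DIGITS) -> str:
    """Convert a hex string to the custom alphabet, most significant digit first."""
    base = len(digits)
    number = int(hex_string, 16)
    if number == 0:
        return digits[0]
    out = []
    while number:
        number, remainder = divmod(number, base)
        out.append(digits[remainder])
    return "".join(reversed(out))

def decode(text: str, hex_len: int = 128, digits: str = CUSTOM_DIGITS) -> str:
    """Convert back to lowercase hex, left-padded with zeros to hex_len characters."""
    base = len(digits)
    index = {ch: i for i, ch in enumerate(digits)}
    number = 0
    for ch in text:
        number = number * base + index[ch]
    return format(number, "x").zfill(hex_len)

sha512_hex = hashlib.sha512(b"seahorsepip").hexdigest()
custom = encode(sha512_hex)
print(custom, len(custom))            # at most 89 characters
print(decode(custom) == sha512_hex)   # True
```

Any string of unique characters can be passed as `digits`; its length becomes the base.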

Original:

seahorsepip

SHA512: 

F9AA2F6D639C026E3325F31247E8253987D6EC6EEC7E93764F9F3CC25D08FABA7DF95FAF94779CACF22D72F96EEE88D46C90A8CE727944218A1DC272EDA29084

Custom base56: 

n<^(q8}=_G@x0;B1]K6zD-DF*96yE-6L#_>K8vJ},vCz02m,8yB][4qA^12>.pw>2-?_m,{0L<qFCK:K,2@04)3s:

 

TL;DR

All data is encoded in a specific base. Data can be represented as a shorter string by increasing its base, and can be represented with a smaller character set by decreasing its base.
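
The trade-off can be put into numbers: a value of b bits needs at least ceil(b / log2(base)) digits in a given base. A small Python sketch of that arithmetic (the function name `encoded_length` is made up for illustration):

```python
import math

def encoded_length(bits: int, base: int) -> int:
    """Fewest digits of the given base that can represent any value of `bits` bits."""
    return math.ceil(bits / math.log2(base))

for base in (16, 36, 56, 64, 85):
    print(base, encoded_length(256, base), encoded_length(512, base))
# base 16 needs 64 and 128 digits, base 85 needs 40 and 80
```

Going from base16 to base85 therefore shortens a SHA512 string from 128 to 80 characters, at the cost of a larger character set.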

 

Off-topic:

Stop the bickering back and forth. We're here to learn things and help each other; if someone doesn't want to share why he wants to do something, then that's his right.

That doesn't mean you have to be rude about it, though. If you don't want to share the why, let others know in a respectful manner.

 
