[M3devel] AND (…, 16_ff)… Not serious - or so I hope!

Dirk Muysers dmuysers at hotmail.com
Wed Jun 27 09:58:28 CEST 2012


Some time ago I started to develop a Unicode library based
on the old M3 text model but using UTF-8 internally rather than
Latin-1 (see README attachment). For reasons best known to
me I had to put it on the back burner in favour of more urgent work.
If anybody is interested in furthering this solution I would gladly
give the existing (pre-alpha) code away.
That said, there are certainly better hash algorithms than the
one used by m3core (e.g. Goulburn, see
http://www.clockandflame.com/media/Goulburn06.pdf).
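
To make the difference between masking and UTF-8 concrete, here is a
minimal sketch in Python (not Modula-3, and not code from m3core or
TextClass.m3). The function names narrow_by_mask and narrow_utf8 are
hypothetical, chosen only for this illustration. It assumes that
"narrowing" means keeping the 8 least significant bits of each code
point, as the AND (…, 16_ff) in the subject line suggests.

```python
# Hypothetical illustration only; none of these names exist in m3core.

def narrow_by_mask(text):
    """Keep only the 8 least significant bits of each code point (lossy)."""
    return bytes(ord(ch) & 0xFF for ch in text)

def narrow_utf8(text):
    """Encode as UTF-8, which can be decoded back without loss."""
    return text.encode("utf-8")

l_stroke = "\u0141"  # LATIN CAPITAL LETTER L WITH STROKE, code point 16_141
kanji = "\u65e5"     # a CJK ideograph in the Basic Multilingual Plane

# Masking throws away the high bits, so 16_141 collides with 16_41 ("A").
print(narrow_by_mask(l_stroke) == narrow_by_mask("A"))

# UTF-8 keeps the information: decoding returns the original character.
print(narrow_utf8(l_stroke).decode("utf-8") == l_stroke)

# The size cost mentioned in the thread: 2 bytes in UTF-16, 3 in UTF-8.
print(len(kanji.encode("utf-16-le")), len(kanji.encode("utf-8")))
```

The point of the sketch is only that the mask is not reversible while
UTF-8 is, at the price of variable-width characters.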

--------------------------------------------------
From: "Mika Nystrom" <mika at async.caltech.edu>
Sent: Wednesday, June 27, 2012 3:54 AM
To: "Dirk Muysers" <dmuysers at hotmail.com>
Cc: <m3devel at elegosoft.com>
Subject: Re: [M3devel] AND (…, 16_ff)… Not serious - or so I hope!

> Memory is always potentially a problem!!!!
>
> One of the main reasons my group was slow at switching from PM3 to CM3
> was because we were processing node names for chip designs as TEXTs.
>
> Chip designs tend to be deeply hierarchical and you wind up printing a
> lot of strings such as
>
> a.b.c.d.e.f.g.h
>
> to files.
>
> That's when you run into problems with Text.Cat.
>
> And memory will always be a problem since you are always designing the
> next generation of computers with the current generation of computers.
>
> Also even if memory weren't a problem, speed is always a problem, and
> speed isn't entirely unrelated to memory.  The Text.Hash I was alluding
> to earlier hashes eight characters per iteration on a 64-bit machine,
> as long as characters are 8 bits...  If you go to 16 bits it'll take
> at least twice as long.  Furthermore if there is more than one way
> (bit pattern) to represent a single CHAR it becomes difficult to use
> algorithms that take more than one at a time.
>
>    Mika
>
> "Dirk Muysers" writes:
>>So let them hate it. Memory is not a problem anymore.
>>
>>--------------------------------------------------
>>From: "Hendrik Boom" <hendrik at topoi.pooq.com>
>>Sent: Tuesday, June 26, 2012 8:19 PM
>>To: <m3devel at elegosoft.com>
>>Subject: Re: [M3devel] AND (…, 16_ff)… Not serious - or so I hope!
>>
>>> On Tue, Jun 26, 2012 at 12:18:41PM +0200, Dragiša Durić wrote:
>>>> This piece of code, from TextClass.m3, disturbs me… a lot.
>>>>
>>>> If we are to use WIDECHAR, I think we must be a lot more serious than
>>>> this.
>>>>
>>>> Probably, text pieces are limited to 128 bytes by design, somewhere.
>>>> But whose idea was it to "narrow" by ignoring everything except the
>>>> 8 LSBs, mapping a set of 2^20 elements to a set of 2^8 elements?
>>>>
>>>> Probably by someone whose mother tongue is fully writeable with ASCII :).
>>>
>>> I'm told the Japanese hate UTF-8, because it expands their characters
>>> from two bytes to three.
>>>
>>> -- hendrik
>>>
> 
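
Mika's remark about hashing eight characters per iteration can be
sketched as follows. This is a minimal Python illustration, not the
Text.Hash he refers to, whose source is not shown in this thread. The
function word_hash is hypothetical, and the FNV-1a constants are
borrowed only as a convenient starting value and odd multiplier. The
sketch assumes a 64-bit word and compares the same node name stored
with 8-bit and with 16-bit characters.

```python
MASK64 = (1 << 64) - 1
OFFSET = 0xCBF29CE484222325  # FNV-1a 64-bit offset basis (assumed choice)
MULT = 0x100000001B3         # FNV-1a 64-bit prime (assumed choice)

def word_hash(data):
    """Hash bytes one 64-bit word at a time; return (hash, iterations)."""
    h = OFFSET
    iterations = 0
    for i in range(0, len(data), 8):
        # A short final chunk is read as if padded with zero bytes.
        word = int.from_bytes(data[i:i + 8], "little")
        h = ((h ^ word) * MULT) & MASK64
        iterations += 1
    return h, iterations

name = "a.b.c.d.e.f.g.h"         # the 15-character node name from the thread
narrow = name.encode("latin-1")   # 8 bits per character: 15 bytes
wide = name.encode("utf-16-le")   # 16 bits per character: 30 bytes

h8, n8 = word_hash(narrow)
h16, n16 = word_hash(wide)

# Twice the bytes means twice the loop iterations for the same text.
print(n8, n16)

# The two encodings are different bit patterns for the same characters,
# so a word-at-a-time hash sees different input for each of them.
print(narrow == wide)
```

The second print shows why more than one representation per CHAR is
awkward here: equal texts must be brought to one bit pattern before a
word-at-a-time hash can be expected to agree on them.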