
Neat! I've also had pretty good luck with nasm on several platforms, including Windows, Linux, OS X, and DOS. fasm also looks pretty neat, but I haven't done much with it aside from a few toys for Menuet OS. And then there's yasm, which I haven't used, but must be fairly popular given its inclusion in several distros.

There's GNU as (gas), which I've used quite a bit, but wouldn't really recommend because it uses strange "AT&T" syntax rather than the syntax you'll find in the Intel manuals. gas is also meant more as part of the GCC pipeline than as a standalone assembler, so even though it can function as one, it's not necessarily nice as one.

I've been meaning to play around with the LLVM assembly language. It looks neat, with the bonus of being reasonably portable, but I haven't yet found the time.

HLA (High Level Assembly) by Randall Hyde seems like an interesting way to slowly lower yourself into assembly language programming, but that's not how I cut my teeth, so I can't speak to its effectiveness.




> There's GNU as (gas), which I've used quite a bit, but wouldn't really recommend because it uses strange "AT&T" syntax

All AT&T syntax is, is

move src, dst

rather than Intel's

move into dst, src

As far as I know, Intel is the only company which did that, and to me it is intuitive to move something somewhere, rather than to somewhere move something.


There are many more differences (sources: http://www.imada.sdu.dk/Courses/DM18/Litteratur/IntelnATT.ht..., http://stackoverflow.com/a/9951916/497193)

In Intel syntax, instructions are not suffixed. In AT&T there is a suffix (q, l, w, b) depending on the operand size. For example (assuming a 32-bit register), mov (Intel) becomes movl (AT&T). The argument order is also reversed. Constants have to be prefixed with $. Hexadecimal values are prefixed with 0x instead of suffixed with h. Registers are prefixed with %. With this we already have

  mov     eax,1
  mov     ebx,0ffh
vs.

  movl    $1,%eax
  movl    $0xff,%ebx
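
To make those rules concrete, here is a rough Python sketch that applies them mechanically. The function name and the register table are made up for illustration, and it assumes only 32-bit registers and immediates (hence the hard-coded l suffix); it is not how any real assembler works.

```python
import re

# Hypothetical helper: handles only 32-bit registers and immediates.
REGS32 = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}

def intel_to_att_simple(line: str) -> str:
    """Rewrite e.g. 'mov ebx,0ffh' (Intel) as 'movl $0xff,%ebx' (AT&T)."""
    mnemonic, operands = line.split(None, 1)
    dst, src = (op.strip().lower() for op in operands.split(","))

    def convert(op: str) -> str:
        if op in REGS32:
            return "%" + op                      # registers get a % prefix
        if re.fullmatch(r"[0-9][0-9a-f]*h", op):
            return f"$0x{int(op[:-1], 16):x}"    # 0ffh -> $0xff
        if op.isdigit():
            return "$" + op                      # constants get a $ prefix
        raise ValueError(f"unsupported operand: {op!r}")

    # 'l' suffix because every operand is assumed to be 32 bits wide;
    # source and destination swap places.
    return f"{mnemonic}l {convert(src)},{convert(dst)}"

print(intel_to_att_simple("mov eax,1"))
print(intel_to_att_simple("mov ebx,0ffh"))
```
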
The notation for accessing memory is different as well: for the addressing forms encoded via the SIB (Scale-Index-Base) byte (+ disp[lacement], if desired), Intel uses [base+disp+index*scale], while AT&T uses disp(%base, %index, scale). Thus we have

  mov edx, [ecx]
  mov eax, [ebp-4]
  mov eax, [ebp-4+edx*4]
  lea eax, [eax*4+8]
  lea eax, [eax*2+eax]
vs.

  movl (%ecx), %edx
  movl -4(%ebp), %eax
  movl -4(%ebp, %edx, 4), %eax
  leal 8(,%eax,4), %eax
  leal (%eax,%eax,2), %eax
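
The mapping between the two memory notations can be sketched in Python as well. Again, the function name is hypothetical, and the sketch assumes decimal displacements and at most one base and one scaled index register; it emits the AT&T form without spaces after the commas.

```python
import re

def intel_mem_to_att(operand: str) -> str:
    """Rewrite an Intel memory operand such as '[ebp-4+edx*4]' in AT&T form."""
    text = operand.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"not a memory operand: {operand!r}")
    inner = text[1:-1].replace(" ", "")

    base, index, scale, disp = None, None, 1, 0
    # Split into signed terms: 'ebp-4+edx*4' -> 'ebp', '-4', '+edx*4'
    for term in re.findall(r"[+-]?[^+-]+", inner):
        sign = -1 if term.startswith("-") else 1
        body = term.lstrip("+-")
        if body.isdigit():
            disp += sign * int(body)
            continue
        if sign < 0:
            raise ValueError(f"registers cannot be subtracted: {term!r}")
        if "*" in body:
            left, right = body.split("*")
            # Accept both 'eax*4' and '4*eax'.
            reg, factor = (left, right) if right.isdigit() else (right, left)
            index, scale = reg, int(factor)
        elif base is None:
            base = body
        else:
            index = body  # second plain register acts as index with scale 1

    disp_text = str(disp) if disp else ""
    if base is None and index is None:
        return str(disp)                      # plain absolute address
    if index is None:
        return f"{disp_text}(%{base})"
    base_text = f"%{base}" if base else ""    # base may be omitted
    return f"{disp_text}({base_text},%{index},{scale})"

for op in ["[ecx]", "[ebp-4]", "[ebp-4+edx*4]", "[eax*4+8]", "[eax*2+eax]"]:
    print(op, "->", intel_mem_to_att(op))
```
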
Edit: On the other hand, when the size of the operand can't be inferred from the instruction, in Intel syntax you have to add 'BYTE PTR', 'WORD PTR', 'DWORD PTR' or 'QWORD PTR' to disambiguate. For example

  mov [ebx], 2
is ambiguous, so in Intel syntax you have to write, depending on the intended size,

  mov BYTE PTR [ebx], 2
  mov WORD PTR [ebx], 2
  mov DWORD PTR [ebx], 2
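
In AT&T syntax the same information lives in the mnemonic suffix instead. A small Python sketch of that correspondence follows; the table and function names are invented for illustration, and it only covers a store of an immediate through a single base register.

```python
# Hypothetical mapping from Intel size keywords to AT&T mnemonic suffixes.
SUFFIX = {"BYTE": "b", "WORD": "w", "DWORD": "l", "QWORD": "q"}

def att_store_immediate(size: str, base_reg: str, value: int) -> str:
    """Render Intel 'mov <size> PTR [base_reg], value' in AT&T syntax."""
    suffix = SUFFIX[size.upper()]
    # Immediate gets a $ prefix, the register a % prefix, and the
    # operand order is reversed (source first, destination last).
    return f"mov{suffix} ${value}, (%{base_reg})"

for size in ("BYTE", "WORD", "DWORD"):
    print(att_store_immediate(size, "ebx", 2))
```
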


Coming from a MOS 6502 / 6510 / Motorola MC680## background,

movl -4(%ebp), %eax

is intuitive to me. On the Motorola MC68000 family, it would have been

move.l -4(sp), a0

so I can hit the ground running with AT&T syntax.

The first time I saw

mov eax, [ebp-4]

I wanted to take a hammer and beat the idiotic IBM PC tin-bucket into oblivion. Did the square brackets have any special meaning? Who the hell knows! What do they mean?!? Completely unintuitive.

With AT&T syntax, I can even hit the ground running reading SPARC assembler:

mov 5, %g1

moves five to global register 1. Perfectly logical and intuitive, and I didn't even have to know anything about SPARC assembler.

Then there is the deadbeefh instead of $deadbeef or 0xdeadbeef syntax. Everybody else used either $deadbeef or 0xdeadbeef, but not Intel, oh no! Intel just had to be different. Irritating to no end. Again, taking a hammer to the PC bucket was a temptation...

Only Intel could come up with something which does not relate to anything.


> Only Intel could come up with something which does not relate to anything.

This is wrong. You have to remember that the 8086 is (mostly) source code compatible with the 8080 (at least if you rename some registers, a simple search & replace), though not binary compatible. The assembler syntax that Intel developed for the 8080 is ugly. But Zilog, who developed the Z80 (which is binary compatible with the Intel 8080), devised a much better assembly language (as far as I know they had to use a different assembly language for legal reasons). For the 8086, Intel built on the ideas behind Zilog's assembly language. In Zilog assembler

  (HL)
  (IX+index)
  (IY+index)
is used for accessing memory. Now replace the ( and ) by [ and ], and additionally keep in mind that the function of the register pair HL in the 8080 roughly corresponds to bx (a "register pair" consisting of bh and bl), and it looks a lot like x86 assembler. (IX and IY only exist in the Z80 and not in the 8080, but despite that, the syntax for indexed addressing is again very reminiscent of what one is used to from x86 assembly in Intel syntax.)

EDIT: Also, the parameter order dst,src is the same as in Z80 assembler (but this order was already used in 8080 assembler, so it was rather Zilog that copied it from there).

TLDR: The Intel syntax is related to Zilog Z80 assembly.


Intel's syntax is the most common one, not AT&T's.

You only see AT&T syntax everywhere nowadays thanks to the rise of GNU/Linux and other open source UNIXes.

If you translate mov into =, the syntax makes much more sense than AT&T.

I mean

mov dst, src

is similar to

dst = src


> Intel's syntax is most common, not the AT&T one.

I'd be a little more careful with such a statement: under Windows (and formerly DOS), Intel syntax is the common one, while under GNU/Linux and OS X the AT&T one is used.

> If you translate mov into =, the syntax makes much more sense than AT&T.

Though I prefer the Intel syntax, I'd be careful with "makes sense" here: according to http://stackoverflow.com/a/4119217/497193, people who grew up with MIPS seem to prefer the AT&T syntax, since it is much more similar to MIPS assembler.


Apparently you missed my second paragraph.

Only UNIX based OSes follow AT&T, for obvious reasons.


It has nothing to do with the operating system. AT&T syntax follows the same style as pretty much any other computer and processor in existence. (Exceptions exist, but they are exotic oddities.)


> Intel's syntax is most common, not the AT&T one.

No, Intel is the exception, not the norm. Amstrad, Atari, Commodore 64, Amiga, Sun (both Motorola and (Ultra)SPARC) all use "move src, dst" and $ or 0x... only Intel diverges from the norm.


I think most popular assemblers use dst,src, not just Intel's. I know z80, 68k, PPC and ARM do at least.


PPC and Motorola 680## series, as well as (Ultra)SPARC all use src, dst.



