Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


c# - Building an assembler

I need to build an assembler for a CPU architecture that I've built. The architecture is similar to MIPS, but this is of no importance.

I started in C#, although C++ might be more appropriate (C# means faster development time for me).

My only problem is that I can't come up with a good design for this application. I am building a two-pass assembler, and I know what I need to do in each pass.

I've implemented the first pass and realised that if I put two lines of assembly code on the same line, no error is thrown. This can only mean one thing: poor parsing techniques.

So, almighty programmers, fathers of assemblers, enlighten me: how should I proceed? I just need to support symbols and data declarations. Instructions have a fixed size.

Please let me know if you need more information.


1 Reply


I've written three or four simple assemblers. I didn't use a parser generator; instead, I modeled them on the S-C assembler for the 6502, which was the one I knew best.

To do this, I used a simple syntax - a line was one of the following:

nothing
[label] [instruction] [comment]
[label] [directive] [comment]

A label was one letter followed by any number of letters or digits.

An instruction was <whitespace><mnemonic> [operands]

A directive was <whitespace>.XX [operands]

A comment was a * and everything after it, up to the end of the line.

Operands depended on the instruction and the directive.
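As a rough sketch of the line syntax above, here is one way to split a source line into its parts with a single regular expression. It is written in Python for brevity. The names `parse_line` and `SourceLine` are made up for illustration, and two things are assumptions on my part: that a label must start in column 1 (since instructions and directives are preceded by whitespace), and that the first * on the line always starts the comment, so a * inside an .AS string is not handled.

```python
import re
from typing import NamedTuple, Optional

# Assumed grammar, taken from the description above: an optional label in
# column 1, then whitespace, then a mnemonic or a .XX directive, then
# optional operands. A '*' starts a comment that runs to end of line.
LINE_RE = re.compile(
    r"""^(?P<label>[A-Za-z][A-Za-z0-9]*)?   # label: a letter, then letters or digits
        (?:\s+(?P<op>\.?[A-Za-z]+)          # mnemonic, or directive with a leading dot
           (?:\s+(?P<operands>.*?))?        # raw operand text, validated later
        )?
        \s*$""",
    re.VERBOSE,
)


class SourceLine(NamedTuple):
    label: Optional[str]
    op: Optional[str]
    operands: Optional[str]
    comment: Optional[str]


def parse_line(text: str) -> SourceLine:
    """Split one source line into label, op, operands and comment."""
    # Assumption: the first '*' always begins the comment.
    code, star, comment = text.partition("*")
    m = LINE_RE.match(code)
    if m is None:
        raise SyntaxError(f"cannot parse line: {text!r}")
    return SourceLine(
        m["label"],
        m["op"],
        m["operands"] or None,
        comment.strip() if star else None,
    )
```

The operand text is deliberately left raw here. Rejecting trailing junk, such as a second instruction on the same line, is the job of whatever parses the operands for that particular instruction or directive.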

Directives included:

.EQ equate, for defining constants

.OR set origin address of code

.HS hex string of bytes

.AS ASCII string of bytes, with any delimiter except white space; whatever started it ended it

.TF target file for output

.BS n reserve block storage of n bytes
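To make the data directives concrete, here is a minimal sketch of turning a .HS, .AS or .BS operand into bytes. The function name `encode_directive` is hypothetical, and a few details are my assumptions rather than anything stated above: the .BS count is decimal, reserved storage is zero-filled, and anything after the closing .AS delimiter is treated as an error.

```python
def encode_directive(name: str, operand: str) -> bytes:
    """Return the bytes produced by a data directive (illustrative only)."""
    operand = operand.strip()
    if name == ".HS":
        # bytes.fromhex rejects non-hex characters and odd digit counts.
        return bytes.fromhex(operand)
    if name == ".AS":
        if not operand:
            raise SyntaxError(".AS needs a delimited string")
        delim = operand[0]              # whatever starts the string ends it
        end = operand.find(delim, 1)
        if end == -1:
            raise SyntaxError(f".AS string is missing closing {delim!r}")
        if operand[end + 1:].strip():
            raise SyntaxError("unexpected text after .AS string")
        return operand[1:end].encode("ascii")
    if name == ".BS":
        # Assumptions: decimal count, zero-filled storage.
        return bytes(int(operand))
    raise ValueError(f"not a data directive: {name}")
```

Directives such as .EQ, .OR and .TF change assembler state rather than emitting bytes, so they would be handled separately.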

When I wrote it, I wrote simple parsers for each component. Whenever I encountered a label definition, I put it in a symbol table with its target address. Whenever an operand referred to a label I didn't know yet, I marked the instruction as incomplete and recorded the unknown label in a "to fix" table, along with a reference to the instruction that needed fixing.

After all source lines had been processed, I went through the "to fix" table and looked up each entry in the symbol table. If I found it, I patched the instruction. If not, it was an error.
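The symbol table and "to fix" table can be sketched roughly like this. The class and method names are invented for the example, and the 16-bit little-endian address format is an assumption borrowed from the 6502; a MIPS-like CPU would patch a different field width and layout.

```python
from typing import Dict, List, Tuple


class AssemblyError(Exception):
    pass


class Assembler:
    def __init__(self, origin: int = 0) -> None:
        self.origin = origin
        self.image = bytearray()                  # assembled output so far
        self.symbols: Dict[str, int] = {}         # label -> target address
        self.fixups: List[Tuple[int, str]] = []   # (offset in image, label)

    def define_label(self, name: str) -> None:
        if name in self.symbols:
            raise AssemblyError(f"duplicate label {name}")
        self.symbols[name] = self.origin + len(self.image)

    def emit(self, data: bytes) -> None:
        self.image += data

    def emit_address(self, label: str) -> None:
        # Assumption: addresses are 16 bits, little-endian.
        addr = self.symbols.get(label)
        if addr is None:
            # Unknown so far: leave a placeholder and remember where it is.
            self.fixups.append((len(self.image), label))
            addr = 0
        self.image += addr.to_bytes(2, "little")

    def resolve(self) -> None:
        """Patch every recorded forward reference, or fail if one is undefined."""
        for offset, label in self.fixups:
            if label not in self.symbols:
                raise AssemblyError(f"undefined label {label}")
            self.image[offset:offset + 2] = self.symbols[label].to_bytes(2, "little")
        self.fixups.clear()
```

Calling `resolve()` once after the last source line plays the role of the final patching step described above.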

I kept a table of instruction names and all the valid addressing modes for operands. When I got an instruction, I tried to parse each addressing mode in turn until something worked.
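Here is a small sketch of that table-driven approach, using a handful of real 6502 opcodes as sample data. The `MODES` and `OPCODES` tables, the `encode` function, and the restriction to hex operands written as `$nn` or `$nnnn` are all simplifications I chose for the example, not the original assembler's syntax.

```python
import re

# Addressing modes, tried in this order. Each pattern must match the whole
# operand, so leftover text on the line is rejected rather than ignored.
MODES = [
    ("implied",   re.compile(r"^$")),
    ("immediate", re.compile(r"^#\$([0-9A-Fa-f]{2})$")),
    ("zeropage",  re.compile(r"^\$([0-9A-Fa-f]{2})$")),
    ("absolute",  re.compile(r"^\$([0-9A-Fa-f]{4})$")),
]

# Mnemonic -> {valid addressing mode: opcode}. A tiny subset of the 6502.
OPCODES = {
    "LDA": {"immediate": 0xA9, "zeropage": 0xA5, "absolute": 0xAD},
    "STA": {"zeropage": 0x85, "absolute": 0x8D},
    "JMP": {"absolute": 0x4C},
    "RTS": {"implied": 0x60},
}


def encode(mnemonic: str, operand: str = "") -> bytes:
    """Encode one instruction by trying each valid addressing mode in turn."""
    valid = OPCODES.get(mnemonic.upper())
    if valid is None:
        raise ValueError(f"unknown mnemonic {mnemonic}")
    operand = operand.strip()
    for mode, pattern in MODES:
        if mode not in valid:
            continue
        m = pattern.match(operand)
        if m is None:
            continue
        encoded = bytes([valid[mode]])
        if m.lastindex:                      # the mode carries an operand value
            digits = m.group(1)
            encoded += int(digits, 16).to_bytes(len(digits) // 2, "little")
        return encoded
    raise ValueError(f"bad operand for {mnemonic}: {operand!r}")
```

Because every pattern is anchored at both ends, an operand such as `#$01 LDA #$02` matches no mode and raises an error, which is the kind of check that catches two instructions written on one line.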

Given this structure, it should take a day, maybe two, to do the whole thing.

