Jump to content

  • Log In with Google      Sign In   
  • Create Account

Miscellaneous Programming Notebook

So I was bored and this happened..

Posted by , 29 June 2013 - - - - - - · 811 views

I couldn't get to sleep tonight (it's 6 in the morning) so I did what all software developers do when idle - I created an assembly language. No, really, I wanted to try my hand at creating some sort of toy interpreted assembly prototype for a while now.

So, without further ado, behold the next trendy programming language, aptly codenamed "asm"! Posted Image

here be dragons


; Calculates the GCD of two integers

	; Pop arguments off the stack (we know how many we expect)
	; Now R0 and R1 contain the two arguments (in correct order)
	JMP loop

	; If R1 is zero, we're done and gcd is R0
	CMP R1 0
	EQL end

	; Else do standard operation
	MOV R2 R1
	MOV R1 R0
	MOD R1 R2
	MOV R0 R2
	JMP loop

	; Clear the stack

	PUSH 1  ; Size of output array (a single integer)
	PUSH R0 ; The GCD of the two inputs
	RET     ; We're done!


; Outputs the first N terms of the Fibonacci sequence

	; Pop arguments off the stack (we read the number of elements desired)
	; Now R0 contains the number of Fibonnaci elements desired

	; Initialize R1 and R2 to the Fibonnaci sequence
	MOV R1 0
	MOV R2 1

	; Push the counter as the stack output size

	JMP head

; This just takes care of outputting the first two terms if necessary
	; If we are asked for zero terms.. make it so
	CMP R0 0
	EQL end

	; Output the first term

	; Do we want the second term?
	CMP R0 0
	EQL end

	; Output the second term

	; Begin main loop
	JMP loop

	; If R0 is zero, we've got all the terms we required
	CMP R0 0
	EQL end

	; Else, we need one more Fibonnaci term - calculate it in R3
	MOV R3 R1
	ADD R3 R2 ; R3 is now equal to R1 + R2 (the next term)
	MOV R1 R2
	MOV R2 R3 ; Shift down the two elements

	PUSH R2  ; Output the new term to the stack

	JMP loop ; Next term!

	RET ; Nothing else to do here, we've streamed out the outputs

; Adds a list of numbers passed as argument

	CNT R0 ; Get the number of input elements
	MOV R1 0 ; R1 will contain the sum

	JMP loop ; Start processing

	; If R0 is zero, we've added all the numbers
	CMP R0 0
	EQL end

	; Else pop the integer and add it
	ADD R1 R2

	; Next element
	JMP loop



And the best thing is that it actually works! This prints out the first seven Fibonacci terms:
$ ./bin/asm fibonacci.asm 7
The parsing and interpreting code is absolutely horrible, though, it started out nice but stuff got out of control after adding the third instruction type. It should be pretty easy to write the code elegantly, especially in C++ (yes, I am also suicidal and wrote the parser in C)

Anyway, the input format is basically, the first stack word is an integer representing the number of input words, those input words are then pushed on the stack. So for the Fibonnaci example, the stack upon running the program looks like this:

[1] [7] [...]

And the output format is very much the same, except the program basically erases the stack and outputs its stuff there. So after returning, it looks like this:

[7] [0] [1] [1] [2] [3] [5] [8] [...]

Yes, it's not very convenient, but you need to start somewhere right? It'll probably need an input and output stream distinct from the actual stack eventually.

Also, the POP instruction will segfault if the stack is empty, and the stack is assumed to be infinite when really, it's limited to 1024 words because I needed an arbitrary limit (later it will need to grow dynamically). The language also supports arbitrarily many registers (defined by the REGISTER keyword at the top - WORDSIZE is supposed to indicate the word size of the registers in bits but they are always 32 bits wide at the moment)

Ultimately my goal is to have some sort of prototyping language between assembly and C that I can use to play with algorithms and stuff and get valuable behavioural info (e.g. highest stack/heap usage achieved, instruction parallelism level, etc..) which would be otherwise difficult to obtain via the x86 instruction set or other (though I suppose tools already exist for this, but whatever, this is for fun too)

I have to say it's pretty thrilling to actually create some sort of parser/compiler that takes your programs and makes them work somehow. Now I know why people are so interested in designing languages.. they are turned on by compilers! Anyway, that's it for tonight.
By the way, feel free to voice your disgust at the misshapen horror I have just brought into existence Posted Image

A few updates!

Posted by , 24 June 2013 - - - - - - · 770 views

Hello GameDev! It's been long, far too long since my last entry. This is actually a new journal, because I've moved a bit from rendering lately for a bit to go back into my older, preferred hobby - cryptography, and generally screwing around writing code that nobody will probably ever use. Though I will openly admit this type of programming has immense educational value.

Back in June 2012, I started working on a cryptography library, called Ordo. A few of you might catch the reference to Neil Stephenson's Cryptonomicon. From the start it was never meant to be an all-encompassing library, so I restricted my scope to symmetric cryptography (that is, no RSA or elliptic curves, no SSL/TLS stuff, just low-level block cipher/hash function/etc code). Why? A few reasons:

1. I do not feel qualified to implement some of the more complex stuff, and I would not be a responsible developer if I released potentially unsafe code. There are a lot more edge cases to check, and it's definitely far too much work for a single person to take on.
2. Having a well-defined, bounded and achievable scope is the cornerstone of every successful project. It helps mitigate the infamous "feature creep" stage, and keeps you focused on your goals as much as possible.
3. Symmetric cryptography is a lot more interesting to me than the rest, even if all the cool kids say otherwise.

Ordo was initially created as an alternative to OpenSSL, which can be a nightmare to work with whenever you run into a problem (I think anyone who has ever used it for a non-trivial task can attest to that), and as such always had three main goals:

1. good performance and cross-platform compatibility (what is the point of creating an inferior product?)
2. easy to use and conventional API (heavily influenced from OpenSSL, with a few improvements)
3. good documentation (the OpenSSL documentation is at best sparse, and at worst nonexistent)

As such, the library was written in portable C89 with system-specific extensions. I'm sure many of you are already shifting uncomfortably in your seats, thinking "why not C++". Well, firstly, C++ doesn't make for very good libraries. Try exporting C++ classes from your library and see how many languages manage to interop with it (hint: not many). Then, there's the problem that C++ binary compatibility has always been a gotcha whereas C binary compatibility is fairly well understood. Finally, C is the lowest common denominator, and simply works everywhere. You can link to a C library (sometimes even statically) and expect it to work, in any language, on any compiler, on any platform. Furthermore, because C is comparatively simple, it was easy to get the boilerplate object-oriented code (involving function pointers and abstraction layers) out of the way and get down to actually writing algorithms.

Also, because performance matters, I decided to have code paths involving raw assembly code (most of which I wrote myself). This complicated things a bit and made for interesting challenges, but was overall worth it. Assembly is cool, and the performance gains are actually worth it most of the time. Of course, this is only used in the most performance-critical parts of the code. But you routinely see at least 120% speed improvements, sometimes up to 300% for particularly clever code. That makes a real difference. Of course there is always a standard C code path, and those assembly implementations are only enabled whenever the processor supports it.

The code itself is documented with doxygen. It's a bit hard to keep up sometimes, but it is actually reasonably well updated. Documentation has always been the hardest part, a constant struggle between consistency and redundancy, but I am determined to make it work in the end. I don't think I could bring myself to write a library without any documentation at all. I take great pride in making things work consistently and elegantly, and a complete and up to date documentation is part of all that. Code samples are important too, but those generally come at a later stage, when the library is sufficiently mature to make the samples meaningful (though I have already implemented a couple).

Finally, a very important part of library development is design. It needs to make sense, be well-structured, and not have stupid inter-dependencies. This is a header dependency graph for the entire library, at its latest version. The overall design has been refactored well over a dozen times, and I find dependency graphs an excellent way to eyeball the general code structure and a first step to spot any possible candidates for refactoring.

Posted Image

The internal dependency graph (of the source files) is more complicated, but the above is what is actually exposed to the user. As you can see, I ultimately went with a modular design. It turns out many of the features of the library can be stripped out without issues, when running very specialized builds or working in constrained environments. Also notice that the graph does not include any actual algorithm references (there are no SHA-256 or AES headers, for instance). This means the abstraction layers are working, and shows that the library scales.

As of the time of this writing, Ordo runs on (at least) Windows 32-bit, Windows 64-bit and all Linux distributions, on x86 and amd64 processors. I have plans to add BSD compatibility but a few issues need to be resolved first, and Mac compatibility is actually unknown (I have never tried, though I suspect it will not be much work to implement).

The github repository is at https://github.com/TomCrypto/Ordo.


All in all I have to say that writing a library of a larger scale than your average python script is an interesting learning experience and I'm learning a lot about software design/maintenance, even if I don't expect the library to ever become popular (plus cryptography is more or less saturated with existing open source or proprietary libraries, and adoption rate is extremely low by nature).

There is still a lot of work to do, and I think the finished product might be at least portfolio-worthy Posted Image

By the way, contributions are highly welcome, as always (should you be interested in the field)

EDIT: currently working on Mac compatibility, and there are a few API improvements in progress.

June 2013 »


Recent Entries

Recent Comments