How compilers work

Use this forum to discuss the philosophy of science. Philosophy of science deals with the assumptions, foundations, and implications of science.
EMTe
Posts: 779
Joined: March 28th, 2012, 5:58 pm
Favorite Philosopher: Jessica Fletcher
Location: Cracow

How compilers work

Post by EMTe » July 30th, 2012, 5:06 pm

A couple of days ago a friend asked me this question. Basically, he couldn't understand how it happens that the English words commonly used in high-level programming languages turn into machine code. I got a little interested in the topic and asked another friend, who codes, and what I received is this message:

"From a philosophical standpoint, the first thing to be aware of is that everything is data. Both data (an MP3 file, for example) and a program (the Firefox browser) are data, stored in the computer's memory (not to be confused with storage, like the hard drive) indistinguishably from each other. This is called the von Neumann architecture, after the genius who came up with a lot of really great stuff. This is also why viruses and other hacks are possible, and why programs sometimes crash: the computer is fooled into executing data as a program when it accidentally, or on purpose, reads data from the wrong place in memory.
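A small sketch in Python of this 'everything is data' point (the MP3 header bytes below are just illustrative):

```python
# In a von Neumann machine, programs and data live in the same memory in
# the same form. Even in Python we can glimpse a shadow of this: a
# function's compiled instructions are just a bytes object, the same type
# that would hold the contents of an MP3 file.
def add(a, b):
    return a + b

program_bytes = add.__code__.co_code   # the function's compiled instructions
mp3_header = b"ID3"                    # the first bytes of a typical MP3 file

print(type(program_bytes) is type(mp3_header))  # True: both are plain bytes
```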

In essence, what a compiler does is translate a piece of data that somewhat resembles English (a text file with source code in it) into instructions for the computer to carry out. A while ago there was a nice Turing machine on Google's front page, which nicely illustrates some aspects of this: the tape is the data, and the machine reads the next symbol and executes whatever instruction it finds there. However, in that machine the memory and the code are isolated from each other.

So, taking the Google Turing machine example, the compiler converts English sentences like "Read the next bit. If it's 0, go down" into symbols for the computer. Why this conversion needs to happen is beyond me. Probably because the CPU ultimately works on 0s and 1s, and the compiler knows exactly the best order of those 0s and 1s so that the program as written (but not necessarily as intended) by the author runs best.

The closest programming language to the "metal" is assembly language, which is literally just instructions to the CPU to read a certain part of memory, jump to a certain point in memory, or write to a certain point in memory. In assembly language, each instruction from the programmer translates 1:1 into an operating instruction for the CPU.

In more traditional programming languages, I'm quite sure what the compiler does is turn the semi-English sentences like (if x>0 then print "hello world") into operating instructions for the CPU. How it happens is way beyond my understanding."
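As an aside, this conversion can actually be watched in miniature: Python's standard `dis` module shows the instructions Python's own compiler produces for a snippet like the one above. These are bytecode instructions for Python's virtual machine rather than real CPU machine code, but the principle is the same.

```python
# Python's compiler turns source text into instructions; `dis` lets us
# inspect them.
import dis

source = 'if x > 0:\n    print("hello world")'
code = compile(source, "<example>", "exec")

# Collect the names of the instructions the compiler produced.
instructions = [ins.opname for ins in dis.get_instructions(code)]
print(instructions)  # includes e.g. LOAD_NAME (fetch x) and COMPARE_OP (x > 0)
```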

As I said above, I got a little interested in the topic and found myself immersed in the magical world of kernels, hardware abstraction layers and similar stuff, eventually going as far as reading how a teleprinter works. I don't understand half of what I read, regardless of the language I read the articles in, but I noticed that even experts don't fully understand the nature of all this. I asked another of my friends, who also codes, and his explanation was a bit clearer.

What he said is that the first ever compiler was simply coded in machine language (according to Wikipedia, the first ever compiler was the A-0 System). Whether the next compilers were also coded in machine language or used the technology of the first compiler I don't know, but I felt interested enough to gather more information from available sources.

Several Wikipedia entries may be useful if you become interested in the topic:

FLOW-MATIC, History of compiler construction, Abstraction layer and many more.
The penultimate goal of the human is to howl like the wolf.

Scott
Site Admin
Posts: 4197
Joined: January 20th, 2007, 6:24 pm
Favorite Philosopher: Diogenes the Cynic
Contact:

Re: How compilers work

Post by Scott » July 30th, 2012, 8:16 pm

Interesting topic!

Here is something to consider: if you open up a basic text editor like Notepad, press 10 keys on your keyboard, and then save the file, it will be a file made up of a series of 80 ones and zeros, each one or zero physically realised as a circuit being on or off. Each letter, number or other character you type is simply stored as an 8-digit binary number.
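The arithmetic can be checked directly; a minimal Python sketch, assuming plain 8-bit ASCII text as in the Notepad example:

```python
# 10 typed characters -> 10 bytes -> 80 ones and zeros (assuming plain
# 8-bit ASCII text; the sample string is just an example of 10 keystrokes).
text = "hello, if!"                      # 10 keystrokes
bits = "".join(f"{byte:08b}" for byte in text.encode("ascii"))
print(len(bits))   # 80
print(bits[:8])    # 01101000, the 8 bits encoding 'h'
```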

Programmers have to code this way, even though it is less efficient in binary terms, because it is easier on the programmer. He can remember a word like 'if' more easily than a string like 0101011101000110. That isn't optimal, since the set of possible commands and the binary syntax could be much smaller, but it would require the programmer to distinguish 0101011101000110 from 0101000101100110 or any other such series (or the odd characters they would produce when not viewed in binary) as easily as he can distinguish a word like 'if' from another word like 'not'. So we create an inefficient syntax for the programmer to write in, but then at some point convert it into a more efficient binary representation for whatever purpose it is needed.

My point is, all the computer ever gets is the binary; all it saves is the binary. I don't think the idea that the text entered by a programmer is converted into more computer-understandable languages is exactly correct, at least not in a philosophical sense, because it implies the computer is able to understand something at that level. But it's a mechanical process through and through. It's just circuits being on or off, and simple mechanical instruments dumbly reacting to streams of binary data with little processing power. The compilers and conversions are just dumb processes that change one binary string, or group of strings, into another according to an algorithm designed by humans to produce a binary string that, when used as the input to a device, gives the desired result.

I suspect human reliance on inefficient verbal language is symptomatic of one of the ways modern computers are superior to humans.
Online Philosophy Club - Please tell me how to improve this website!

Check it out: Abortion - Not as diametrically divisive as often thought?

EMTe
Posts: 779
Joined: March 28th, 2012, 5:58 pm
Favorite Philosopher: Jessica Fletcher
Location: Cracow

Re: How compilers work

Post by EMTe » August 2nd, 2012, 4:19 pm

This is my friend's answer to your post. It's not my point of view, since I am not a coder and certainly don't possess enough knowledge on the subject, even theoretical, but I'm simply interested in reading similar discussions.

-----------------------------------------------------------------------------------------------------------------------------------------------
Scott wrote:"Here is something to consider: If you open up a basic text-editor like notepad and click 10 keys on your keyboard, then save the file. It will be a file made up of a series of 80 ones or zeros, each one or zero physically made up of a circuit being on or off. Each letter, number or other character you type is simply an 8-digit binary number up to 8 digits.
This isn't actually true: one character can actually be multiple bytes (in UTF-8 encoding, for example), but that's nit-picking.
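A quick check in Python illustrates the nit-pick (using UTF-8, the most common encoding today):

```python
# One on-screen character is not always one byte: in UTF-8, basic Latin
# letters take a single byte, but many characters take more.
a = len("A".encode("utf-8"))      # 1 byte
euro = len("€".encode("utf-8"))   # 3 bytes
print(a, euro)
```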
Scott wrote:So we create an inefficient syntax for the programmer to write in, but then at some point convert it into a more efficient binary representation for whatever purpose it is needed.
It's not true that a programming language has to be "inefficient". In an assembly language, each instruction maps exactly to one instruction to the processor. Basically, the programmer writes "decrease value in register 1", which is exactly the same thing as "00000101", so no efficiency is lost. However, modern computer architecture is so complex that working at such a low level would be the same thing as (quoting Hofstadter) trying to look at DNA atom by atom.
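A hedged sketch of that 1:1 mapping, using an invented instruction set; the mnemonics and opcode values below are made up for illustration and do not belong to any real CPU:

```python
# A toy assembler for a hypothetical CPU.
OPCODES = {
    "DEC R1": 0b00000101,  # decrease value in register 1
    "INC R1": 0b00000100,  # increase value in register 1
    "HALT":   0b00000000,  # stop execution
}

def assemble(lines):
    # Each mnemonic maps 1:1 to a single opcode byte, so nothing is lost
    # in translation; the names exist purely for the human.
    return bytes(OPCODES[line] for line in lines)

machine_code = assemble(["DEC R1", "DEC R1", "HALT"])
print(machine_code.hex())  # 050500
```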
Scott wrote:My point is, all the computer ever gets is the binary, all it saves is the binary. I don't think the idea that the text entered by a programmer is converted into more computer-understandable languages is exactly correct, at least not in a philosophical sense, because it implies the computer is able to understand something at that level.
Ah, but it is. In machine code, your instruction set mostly consists of doing stuff to memory or jumping to another instruction. Even a simple piece of high-level code like "echo "Hello world."" contains so many things that the processor does not understand or even care about. For the computer to carry out the task of displaying "Hello world" on the display, it is broken down into many simple instructions that individually have very little to do with displaying stuff on a screen.
Scott wrote:I suspect human reliance on inefficient verbal language is symptomatic of one of the ways modern computers are superior to humans.
I'll just leave this 64kb intro here. The computer is a big-ass calculator and is subject to the laws of mathematics. Saying that modern computers are superior to humans is the same as saying mathematics is superior to humans, but really, they are just our tools.

Scott
Site Admin
Posts: 4197
Joined: January 20th, 2007, 6:24 pm
Favorite Philosopher: Diogenes the Cynic
Contact:

Re: How compilers work

Post by Scott » August 4th, 2012, 7:33 pm

Scott wrote:So we create an inefficient syntax for the programmer to write in, but then at some point convert it into a more efficient binary representation for whatever purpose it is needed.
It's not true that a programming language has to be "inefficient". In an assembly language, each instruction maps exactly to one instruction to the processor. Basically, the programmer writes "decrease value in register 1", which is exactly the same thing as "00000101", so no efficiency is lost.
"Decrease value in register 1" is a 28-character command. At a byte per character, that would give us a 224-digit binary number, not an 8-digit one. To open up a text editor, write that code, and then have it "mapped to" a single byte is the kind of conversion to which I referred.

A more efficient system would be to go in and write a single character, instead of any given 28-character command, to represent a single byte of data according to some conversion mechanism. Needless to say, human beings are not capable of such efficient communication, mainly because of the very different way our memory works: it is much more vague, but much more powerful for many practical 'intelligent' applications.
Scott wrote:My point is, all the computer ever gets is the binary, all it saves is the binary. I don't think the idea that the text entered by a programmer is converted into more computer-understandable languages is exactly correct, at least not in a philosophical sense, because it implies the computer is able to understand something at that level.
Ah, but it is. In machine code, your instruction set mostly consists of doing stuff to memory or jumping to another instruction. Even a simple piece of high-level code like "echo "Hello world."" contains so many things that the processor does not understand or even care about. For the computer to carry out the task of displaying "Hello world" on the display, it is broken down into many simple instructions that individually have very little to do with displaying stuff on a screen.
You seem to be contradicting yourself. Are you saying the computer actually understands -- in the philosophical sense -- what it is doing, or are you agreeing with me that it does not, and that, if anything, it is even further from the human-like consciousness required for understanding in a philosophical sense at that lower level?
I'll just leave this 64kb intro here. The computer is a big-ass calculator and is subject to the laws of mathematics. Saying that modern computers are superior to humans is the same as saying mathematics is superior to humans, but really, they are just our tools.
Interesting video and contest. I didn't say computers were superior to humans in general, and I'm not sure what that would even mean. I said that efficient communication is one of the ways in which computers are better than humans. Solving math equations is another. There are plenty of ways in which humans are superior to computers. Philosophically speaking, I think we can agree that saying anything is superior to something else without qualification, at least implicit qualification, is nonsense. Indeed, any superiority can be reversed by using the inverse qualification. If Joe is superior in tallness to Jane, then Jane is superior to Joe in shortness.

EMTe
Posts: 779
Joined: March 28th, 2012, 5:58 pm
Favorite Philosopher: Jessica Fletcher
Location: Cracow

Re: How compilers work

Post by EMTe » August 5th, 2012, 3:42 pm

Scott wrote:A more efficient system would be to go in and write a single character instead of any given 28-character command which represents a single byte of data according to some conversion mechanism. Needless to say, human beings are not capable of such efficient communication namely because of the very different way our memory works, such as that is is much more vague but much more powerful for many practical 'intelligent' applications."
As it happens, there is the aptly-named f*** [[see Wikipedia link for the English vulgar word beginning with "F"]]: eight instructions, each one byte long. I would not say that it's a really powerful language, yet everything is possible in it.
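The whole language fits in a few dozen lines. Here is a minimal interpreter in Python (an illustrative sketch of my own, not a reference implementation); the sample program computes 8*8+1 = 65 and prints the character 'A':

```python
# A minimal interpreter for the eight-instruction language discussed above.
def bf(program, input_bytes=b""):
    tape = [0] * 30000          # the memory cells
    out = []
    ptr = pc = inp = 0
    # Pre-compute matching bracket positions for the two jump instructions.
    jumps, stack = {}, []
    for i, c in enumerate(program):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    while pc < len(program):
        c = program[pc]
        if c == ">":
            ptr += 1                            # move the data pointer right
        elif c == "<":
            ptr -= 1                            # move the data pointer left
        elif c == "+":
            tape[ptr] = (tape[ptr] + 1) % 256   # increment the current cell
        elif c == "-":
            tape[ptr] = (tape[ptr] - 1) % 256   # decrement the current cell
        elif c == ".":
            out.append(chr(tape[ptr]))          # output the current cell
        elif c == ",":
            tape[ptr] = input_bytes[inp] if inp < len(input_bytes) else 0
            inp += 1                            # read one byte of input
        elif c == "[" and tape[ptr] == 0:
            pc = jumps[pc]                      # skip the loop body
        elif c == "]" and tape[ptr] != 0:
            pc = jumps[pc]                      # repeat the loop body
        pc += 1
    return "".join(out)

# Loop eight times, adding 8 to the second cell; 64 + 1 = 65 = ASCII 'A'.
print(bf("++++++++[>++++++++<-]>+."))  # A
```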
Scott wrote:You seem to be contradicting yourself. Are you saying the computer actually understands -- in the philosophical sense -- what it is doing or are you agreeing with me that it does not and if anything it is less like the human-like consciousness required to have understanding in a philosophical sense at that lower-level."
I probably shouldn't have used the verb "understand". What I meant by it was that the processor mostly just modifies its memory as instructed by the program. It's an arithmetical engine; there is no sentience and no understanding. What I meant was that languages like f*** and assembly do not introduce any concepts (like "print to screen") beyond these memory operations, which are what most other higher-level languages ultimately reduce the code to.

DanLanglois
Posts: 142
Joined: August 1st, 2012, 12:03 am

Re: How compilers work

Post by DanLanglois » August 7th, 2012, 12:22 am

How compilers work turns out to be a rather high-tech subject, naturally. If you get some exposure to Noam Chomsky's work on syntax, it gets very high-tech indeed. The syntax of programming languages is formally specified. Chomsky described four classes of grammars that define four classes of languages, and two of these grammar classes (the regular and the context-free grammars) turn out to be useful in compiler construction. You may get to the point of understanding parse trees, which are hierarchical structures, but my main point is that there is a whole field to discuss here.
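To make that concrete, here is a hedged sketch of a recursive-descent parser in Python for a tiny invented context-free grammar of arithmetic expressions; the grammar rules are in the comments, and the nested-tuple tree format is just for illustration:

```python
# Grammar (context-free, in the Chomsky sense):
#   expr   -> term ('+' term)*
#   term   -> factor ('*' factor)*
#   factor -> NUMBER | '(' expr ')'
import re

def tokenize(source):
    return re.findall(r"\d+|[+*()]", source)

def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        if peek() == "(":
            eat()               # consume '('
            node = expr()
            eat()               # consume ')'
            return node
        return ("num", int(eat()))

    def term():
        node = factor()
        while peek() == "*":
            eat()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            eat()
            node = ("+", node, term())
        return node

    return expr()

# '*' binds tighter than '+' purely because of where it sits in the grammar.
print(parse(tokenize("1+2*3")))  # ('+', ('num', 1), ('*', ('num', 2), ('num', 3)))
```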

EMTe
Posts: 779
Joined: March 28th, 2012, 5:58 pm
Favorite Philosopher: Jessica Fletcher
Location: Cracow

Re: How compilers work

Post by EMTe » December 24th, 2017, 12:42 pm

*bump!*

I wonder if there's anyone alive here, from the past, or a new member, eager to enlighten me with his views.

I've been checking the Wikipedia article about "abstraction layers" from time to time, but I've learned nothing new from the changes. I think the issue we're discussing here is one of the most philosophically mysterious, and at the same time most interesting, there is.

JamesOfSeattle
Posts: 373
Joined: October 16th, 2015, 11:20 pm

Re: How compilers work

Post by JamesOfSeattle » December 25th, 2017, 3:14 am

I'm interested in the topic, but I'm not sure what issue(s) you would like addressed. Do you have a specific question?

*

Steve3007
Posts: 4578
Joined: June 15th, 2011, 5:53 pm
Favorite Philosopher: Eratosthenes
Location: UK

Re: How compilers work

Post by Steve3007 » December 25th, 2017, 6:07 am

EMTe:

Interesting to hear from you again after all this time. As I recall, you always had quite a curious writing style but I can't remember right now the particular way in which it was curious. Perhaps if you speak again I'll remember.

I'm interested in DanLanglois's comment from over 5 years ago about Chomsky's descriptions of the grammar of programming languages. I'll have to look it up.

Anyway, back to Christmas day for now. God bless us one and all. And all that.
"When the seagulls follow the trawler, it is because they think sardines will be thrown into the sea." - Eric Cantona.
