View previous topic - View next topic |
Author |
Message |
BigManJones Scholar
Joined: 22 Mar 2003 Posts: 196
|
Posted: Sat Jul 19, 2003 2:33 am Post subject: New Scripting Language; first try |
[quote] |
|
Here is the outline/design doc for my scripting language. I haven't coded anything yet, but today I figured out how stacks are used in compilers to parse nested expressions, so in a few weeks I hope to have demo or something. Those of you who have coded a scripting language, what do you think? Any ideas or hints?
EDIT: so basically this is both the vm and the language; the opcodes are given then a small example of the language wich is then given as opcodes that would be converted to bytecode for the vm.
Code: |
Super Simple Scripting Language:
This is the outline for a really basic scripting language using some basic opcodes and commands the script can call from the host application. The memory area will consist of an integer array of some large enough size. Only one data type, ints. I'm thinking of how to do strings though. The pc counter will keep track of which memory location is being executed. All global variables. No functions (yet).
Opcodes:
Two addressing modes; mm or lm. mm = memory to memory lm = literal to memory
BRA - branch
BEQ - branch when equal
BNQ - branch when not equal
BGT - mem comp mem or mem comp value
BLT
BGE
BLE
MOV - move value to memlocation
CPY - copy mem to mem
JSR - move p.c.
RTS - move p.c.
ADD - add value to memlocation
SUB - subtract
MUL
DIV
PUS - push onto stack??
POP - pop from stack??
100-199 function calls
Highlevel statements:
declare
assign
if...else
while
jump...return
math
App Cmds (commands to access the game engine):
storyflags (global?):
getflag
setflag
loadlevel
loadmap
loadgfx
loadscripts
setcam
setcamobj
movecamto
setcampos
shake
zoom
gui
dodialog-convo/story
inv/equipdialog
storedialog
statsdialog
objects
create
setpos
getpos
moveto
movetowait
ismoving
changeinv
changestat
particlegen
doscript
Scriptable Events (examples):
cutscenes
enemyai
npc's
mapstuff
platforms
switches
doors
newlevels
particlegen
dialogs
convo/story points
Sample Script: for a moving platform
declare startx,starty,platform1
assign startx 200
assign starty 50
assign platform1 create(platform,startx,starty)
while(true) {
if ismoving(platform1) == 0 {
if getposx(platform1) == 50 moveto(platform1,startx,200)
if getposx(platform1) == 200 moveto(platform1,startx,50)
}
}
Resulting Assembly (hand assembled-return value memory location after function call):
0001 =startx
0002 =starty
0003 =platform1
0004 mov 1 200
0007 mov 2 50
0010 111 003 001 002 // create opcode = 111 objecttype= 3
0014 // value return to mem after call???
0015 mov 0014 0003
0018 112 0003 // 112=ismoving(platform1) -- tag1
0020 // return from ismoving()
0021 bnq 0020 0 0050 // if 0020 == 0 bra to 0050
0025 113 0003 // 113=getposx()
0027
0029 bnq 0027 50 0038 // if != 50 goto 0038
0033 114 0003 0001 200 // 114=moveto()
0037
0038 113 0003 // getposx()
0040
0041 bnq 0027 200 0050 // if != 200 goto 0050
0045 114 0003 0001 200 // moveto()
0049
0050 bra 0018
|
|
|
Back to top |
|
|
valderman Mage
Joined: 29 Aug 2002 Posts: 334 Location: Gothenburg, Sweden
|
Posted: Wed Jul 23, 2003 10:03 pm Post subject: |
[quote] |
|
I think the two addressing modes might slow the VM down. I have a crapload of instructions for each task in my VM (for example memcopys: const to mem, mem to mem, const to reg, mem to reg, reg to mem), so I don't have any unnecessary ifs or switches here and there, and I think I've got it running pretty fast for a VM. You can check it out in the engines section of RPGDX (it's called Valderman's Virtual Machine ^_^).
As for the high-level things, I don't know much about that. I have yet to create a compiler othern than a basic (and SLOW) assembler for my VM. _________________ http://www.weeaboo.se
|
|
Back to top |
|
|
BigManJones Scholar
Joined: 22 Mar 2003 Posts: 196
|
Posted: Thu Jul 24, 2003 1:27 am Post subject: |
[quote] |
|
A response, cool!
Quote: | I think the two addressing modes might slow the VM down. I have a crapload of instructions for each task in my VM (for example memcopys: const to mem, mem to mem, const to reg, mem to reg, reg to mem), so I don't have any unnecessary ifs or switches here and there, and I think I've got it running pretty fast for a VM. |
Thats the conclusion I came to also; to much trouble to decipher a bunch of wierd tokens. In the vm I'm thinking of doing a few ifs with some cases like this (assuming that BRANCH instruction are lower in number than MATHS):
Code: |
if(nextinstruction <= BRANCHES) {
case nextinstruction:
when BRA....
when BEQ....
}
if(nextinstruction <= MATHS) {
case...
}
|
Its like a binary search tree in code; ie a decision tree.
So far as the compiler, the key is using a stack. You push instructions, operands, and memory locations until you get to one of higher priority (a closing brace for example) then pop them all back out and theres your bytecode. (you have to push a couple of 'program counter' numbers too). I haven't worked out how to parse the source file yet though.
|
|
Back to top |
|
|
valderman Mage
Joined: 29 Aug 2002 Posts: 334 Location: Gothenburg, Sweden
|
Posted: Thu Jul 24, 2003 10:16 am Post subject: |
[quote] |
|
Parsing the source file is what's keeping me from writing one. I'll have to see if the famous "Dragon book" is available online.
Quote: | Thats the conclusion I came to also; to much trouble to decipher a bunch of wierd tokens. In the vm I'm thinking of doing a few ifs with some cases like this (assuming that BRANCH instruction are lower in number than MATHS): |
This doesn't look very fast, unless it gets optimized by the compiler in some way I'm unaware of. I think the two best approaches are:
a) Have a huge switch statement with all the opcodes, wich compiles into a jump table, provided you have enough instructions. This is the fastest option, while the code gets a bit messy. Call inlined functions from the switch to make it a little bit cleaner without having to deal with function overhead.
b) Create a function table and call different functions for each opcode, like this: InstrFuncs[iOpcode](pParameters);
Makes the code cleaner at the cost of speed, since you get the function call overhead.
If you are really fond of OO, you could do something like:
Code: | class CInstruction
{
public:
CInstruction() : m_pData(0), m_iOpcode(0) {}
~CInstruction() {if(m_pData) delete [] m_pData;}
int Load(char *pBuffer);
virtual void Exec() = 0;
private:
char *m_pData;
int m_iOpcode;
}; |
...and then derive a new class frmo CInstruction for each instruction, overloading only the Exec() function.
This would probably be a lot slower than any of the aforementioned approaches, although it will most likely make for nice-looking code. _________________ http://www.weeaboo.se
|
|
Back to top |
|
|
BigManJones Scholar
Joined: 22 Mar 2003 Posts: 196
|
Posted: Fri Jul 25, 2003 1:34 am Post subject: |
[quote] |
|
Quote: | a) Have a huge switch statement with all the opcodes, wich compiles into a jump table, provided you have enough instructions. This is the fastest option, while the code gets a bit messy. Call inlined functions from the switch to make it a little bit cleaner without having to deal with function overhead.
|
Whats a jump table?
|
|
Back to top |
|
|
valderman Mage
Joined: 29 Aug 2002 Posts: 334 Location: Gothenburg, Sweden
|
Posted: Fri Jul 25, 2003 11:03 am Post subject: |
[quote] |
|
If you have opcode values ranging from, say, 5 to 103, the jump table would look something like this (but probably with a better solution for the first two CMPs):
Code: |
CMP opcode, 5
JL default
CMP opcode, 103
JG default
JMP [table+opcode]
; Opcodes here
default:
|
In short, the opcodes are made indices into a table of the memory locations to jump to for each opcode. _________________ http://www.weeaboo.se
|
|
Back to top |
|
|
BigManJones Scholar
Joined: 22 Mar 2003 Posts: 196
|
Posted: Sun Jul 27, 2003 2:29 pm Post subject: |
[quote] |
|
Is the code you posted inline assembler?
One thing is, I'm not using registers in my vm, I'm doing it all in code. Here is my first try at the inner loop in java with 4 opcodes:
Code: |
100 = mov
101 = prt
102 = add
103 = bne
pc=001;
running=1;
while(running==1) {
if(memspace[pc]==100)
memspace[memspace[++pc]]=memspace[++pc];
else if(memspace[pc]==101)
System.out.println("/"+memspace[memspace[++pc]]);
else if(memspace[pc]==102)
memspace[memspace[++pc]]+=memspace[++pc];
else if(memspace[pc]==103) {
if(memspace[memspace[++pc]] != memspace[++pc])
pc=memspace[++pc];
pc--;
}
else if(memspace[pc]==999) running=0;
pc++;
}
|
Of course I can't use inline assembler in java. Why java you ask? 1. Cant use lua with java 2. I'd like to make some java games.
|
|
Back to top |
|
|
LeoDraco Demon Hunter
Joined: 24 Jun 2003 Posts: 584 Location: Riverside, South Cali
|
Posted: Mon Jul 28, 2003 7:50 am Post subject: |
[quote] |
|
You can go about the creation of scripting languages -- in general -- via parser and lexer functions that split up the work for you. Fairly good tools -- and free, at that -- exist for the programmer who does not wish to roll their own. I would suggest Flex and Bison. Flex generates lexers (aka scanners), while Bison generates parsers. All you feed them is the list of tokens to search for (Flex) and the combinations of those tokens (Bison).
Stacks are not the only way information is stored for a runtime representation of source code. Stacks are used for this purpose (mostly to hold variable declaration depth, to resolve ambiguity); however, you'll find a tree hierarchy to be a more smooth approach to translating whatever grammer you have into a dynamic representation. There are benefits of this approach: (a) analysis of code is simple, as each node in the tree knows what children it has, and can therefore analyse those children. This allows for type-checking (if you want types) or for dynamic-casting, should you want an untyped grammer. (b) code generation (or execution) is also very simple, as each node knows what children it has, and what order those children should be executed in.
And really: you can create a run-time representation of relatively complex pieces of source in a single pass in practically no time at all with an OOD. A second pass to analyze the code takes no time at all. Generation of destination code (or execution of other objects) is swift. Three-pass, hierarchal tree compilers are simple to build and moderately swift on modern compilers. (Usually the slowest phase of a compiler -- after the parse tree has been constructed -- is in optimizing the destination source code, as this is an open-ended operation. Theoretically, you could have a compiler attempt to optimize its output for years without ever coming to a perfectly optimized representation of your source.) _________________ "...LeoDraco is a pompus git..." -- Mandrake
|
|
Back to top |
|
|
valderman Mage
Joined: 29 Aug 2002 Posts: 334 Location: Gothenburg, Sweden
|
Posted: Mon Jul 28, 2003 9:31 am Post subject: |
[quote] |
|
BigManJones wrote: | Is the code you posted inline assembler?
One thing is, I'm not using registers in my vm, I'm doing it all in code. Here is my first try at the inner loop in java with 4 opcodes:
Code: |
100 = mov
101 = prt
102 = add
103 = bne
pc=001;
running=1;
while(running==1) {
if(memspace[pc]==100)
memspace[memspace[++pc]]=memspace[++pc];
else if(memspace[pc]==101)
System.out.println("/"+memspace[memspace[++pc]]);
else if(memspace[pc]==102)
memspace[memspace[++pc]]+=memspace[++pc];
else if(memspace[pc]==103) {
if(memspace[memspace[++pc]] != memspace[++pc])
pc=memspace[++pc];
pc--;
}
else if(memspace[pc]==999) running=0;
pc++;
}
|
Of course I can't use inline assembler in java. Why java you ask? 1. Cant use lua with java 2. I'd like to make some java games. |
It's x86 assembly, as accepted by Microsoft's MASM and VS.NET inline assembler. I think you missed my point, however. You shouldn't write the ASM yourself, any decent compiler, no matter the language (most good Java JIT compilers too), should be able to translate a switch...case statement into a jump table.
The if...else statements you presented will most likely be slow as molasses compared to any of my three aforementioned approaches. It requires at least three instructions for each opcode, even if the opcode isn't the one to execute.
Let's say you have 50 opcodes, and opcode #49 is called. That gives you an overhead of 150 opcodes for one single instruction - and that is if you are lucky, it might be far more.
With a jump table, OTOH, you will only have ~5 instructions of overhead, no matter which opcode you execute.
LeoDraco wrote: | You can go about the creation of scripting languages -- in general -- via parser and lexer functions that split up the work for you. Fairly good tools -- and free, at that -- exist for the programmer who does not wish to roll their own. I would suggest Flex and Bison. Flex generates lexers (aka scanners), while Bison generates parsers. All you feed them is the list of tokens to search for (Flex) and the combinations of those tokens (Bison).
Stacks are not the only way information is stored for a runtime representation of source code. Stacks are used for this purpose (mostly to hold variable declaration depth, to resolve ambiguity); however, you'll find a tree hierarchy to be a more smooth approach to translating whatever grammer you have into a dynamic representation. There are benefits of this approach: (a) analysis of code is simple, as each node in the tree knows what children it has, and can therefore analyse those children. This allows for type-checking (if you want types) or for dynamic-casting, should you want an untyped grammer. (b) code generation (or execution) is also very simple, as each node knows what children it has, and what order those children should be executed in.
And really: you can create a run-time representation of relatively complex pieces of source in a single pass in practically no time at all with an OOD. A second pass to analyze the code takes no time at all. Generation of destination code (or execution of other objects) is swift. Three-pass, hierarchal tree compilers are simple to build and moderately swift on modern compilers. (Usually the slowest phase of a compiler -- after the parse tree has been constructed -- is in optimizing the destination source code, as this is an open-ended operation. Theoretically, you could have a compiler attempt to optimize its output for years without ever coming to a perfectly optimized representation of your source.) | I strongly suggest against interpreting the scripts at run-time, since a) whatever way you do it, it will be a lot slower than interpreting compiled bytecode, and b) it will be too easy for users to modify scripts.
This post, however, is very useful when it comes to implementing the compiler. _________________ http://www.weeaboo.se
|
|
Back to top |
|
|
janus Mage
Joined: 29 Jun 2002 Posts: 464 Location: Issaquah, WA
|
Posted: Mon Jul 28, 2003 10:19 am Post subject: |
[quote] |
|
Um, sorry if I sound stupid, but I've never understood why anyone in their right mind would want to create a scripting system that just apes assembly. I've never found assembly the least bit easy to code in and it seems to me like hardcoding your gameplay mechanics would be easier than having to write them in an assembly knockoff... am I missing something here?
|
|
Back to top |
|
|
BigManJones Scholar
Joined: 22 Mar 2003 Posts: 196
|
Posted: Mon Jul 28, 2003 12:50 pm Post subject: |
[quote] |
|
Well, I've got most of the vm done with basic opcodes and using a few if/else and some switches mixing them like I said above. The only problem is that to do a time test I'll have to run ALOT of comparisons; like 10,000's.
I've been doing some reading and it seems that compilers are much simpler to write if you assume the grammar is correct; not good if you want other people to use your stuff, but, yeah.
Janus wrote: | Um, sorry if I sound stupid, but I've never understood why anyone in their right mind would want to create a scripting system that just apes assembly. I've never found assembly the least bit easy to code in and it seems to me like hardcoding your gameplay mechanics would be easier than having to write them in an assembly knockoff... am I missing something here? |
Well, what I'm going to do is have opcodes that when run in the vm will call functions in the game and then possibly return a value to the program. I wasn't going to script the gameplay, just events. Does that make a little more sense? Also, if you have bytecode then your language/compiler can take on any form. A pascal compiler could be written to generate bytecode that runs on the Java vm. And what other way, besides assembly style, is there to make a scripting language?
|
|
Back to top |
|
|
Rainer Deyke Demon Hunter
Joined: 05 Jun 2002 Posts: 672
|
Posted: Mon Jul 28, 2003 5:02 pm Post subject: |
[quote] |
|
Valderman wrote: | I strongly suggest against interpreting the scripts at run-time, since a) whatever way you do it, it will be a lot slower than interpreting compiled bytecode, and b) it will be too easy for users to modify scripts.
This post, however, is very useful when it comes to implementing the compiler. |
That really depends on what you mean by "interpreting a run-time". If you compile into either a byte-code or tree-based representation at load time, there's no significant speed loss compared to pre-compiling, unless you've got a lot of scripts.
In the end you're going to interpret something at run-time, either a tree or byte-code. I've found that using a tree is at least as fast as the byte code, and it's easier to program. So I really don't see the point of using byte code at all. If you need to pre-compile, pre-compile to tree form.
But, yes, if you try to interpret your scripts directly instead of building a tree or byte code, it's going to be slow. Same if you build your tree or byte code every time you run a script.
|
|
Back to top |
|
|
LeoDraco Demon Hunter
Joined: 24 Jun 2003 Posts: 584 Location: Riverside, South Cali
|
Posted: Mon Jul 28, 2003 5:55 pm Post subject: |
[quote] |
|
Valderman wrote: | I strongly suggest against interpreting the scripts at run-time, since a) whatever way you do it, it will be a lot slower than interpreting compiled bytecode, and b) it will be too easy for users to modify scripts.
This post, however, is very useful when it comes to implementing the compiler. |
The only way in which a tree-based representation would be significantly slower is in the construction of the tree. Even that wouldn't be entirely too heavy an operation, assuming that all you are doing is constructing the tree in one pass.
And who cares if script files are easily editable? If people want to take the time to look at the format to such a degree as to change the script files, what difference will it make to have script code over bytecode? A programmer who is relatively proficient with whatever language you emulate will be able to pick up a syntax regardless of how it is set up. An easily modifiable engine is a good thing, as it can provide for a large community. I can understand not wanting a simple rpg engine to have that degree of modification (as it somehow interfers with the author's "vision"); however, I don't see how making script files difficult to edit will make things more difficult for people who are determined to alter the game. JoeUser -- who more than likely has no programming knowledge -- would not understand a c-syntax language any more than an x86 asm-syntax language.
Plus, by directly interpreting the scripts, you don't have to recompile the scripts prior to testing changes made to the scripts. _________________ "...LeoDraco is a pompus git..." -- Mandrake
|
|
Back to top |
|
|
valderman Mage
Joined: 29 Aug 2002 Posts: 334 Location: Gothenburg, Sweden
|
Posted: Mon Jul 28, 2003 7:11 pm Post subject: |
[quote] |
|
I guess I misunderstood you a bit; I thought you meant to parse the high-level language as you go along, not parse it at startup. Of course, compiling at startup wouldn't be very much slower than having the scripts pre-compiled. Compiling the scripts at load-time, and thus making them readable to anyone with a text editor, is (IMO) not as good, since it would be far too easy for users to cheat. Of course, compilers and other dev tools should be released along with the game anyway - they won't make the cheating very much easier.
Quote: | Um, sorry if I sound stupid, but I've never understood why anyone in their right mind would want to create a scripting system that just apes assembly. I've never found assembly the least bit easy to code in and it seems to me like hardcoding your gameplay mechanics would be easier than having to write them in an assembly knockoff... am I missing something here? | Naturally, you'll have to create some kind of high-level language for your scripting system to be nice to work with, but you'll have to translate it to something more machine-friendly before executing it, which means bytecode. Almost every programming language in existence works on this basis; C, C++, C#, Java - they all compile into either machinecode or, in the case of Java and C#, bytecode which compiles into machinecode at load-time. The only notable difference is the BASIC family, which runs interpreted, but even these languages can be compiled nowadays. _________________ http://www.weeaboo.se
|
|
Back to top |
|
|
LeoDraco Demon Hunter
Joined: 24 Jun 2003 Posts: 584 Location: Riverside, South Cali
|
Posted: Tue Jul 29, 2003 12:24 am Post subject: |
[quote] |
|
Valderman wrote: | I guess I misunderstood you a bit; I thought you meant to parse the high-level language as you go along, not parse it at startup. Of course, compiling at startup wouldn't be very much slower than having the scripts pre-compiled. Compiling the scripts at load-time, and thus making them readable to anyone with a text editor, is (IMO) not as good, since it would be far too easy for users to cheat. Of course, compilers and other dev tools should be released along with the game anyway - they won't make the cheating very much easier. |
Eh. It really just depends upon the level of security and game-world extensibility that you are going for. I find that strategy games (RTS and TBS) generally benefit from an easy to modify engine, as such an engine seems to extend the longevity and popularity of the program.
My emphasis was to load the script file -- and create the syntactic/semantic tree representation thereof -- during some loading phase; the script would be loaded and semantically checked. Then, all the engine would have to do is take the root node of the semantic tree, instruct that node to generate (or execute, etc.), and the scene (or whatever) that the script describes would then execute. It is very simple to code, and it is relatively elegant.
Sure, for a standard RPG engine, where you want to "hide" the scripting language from the user, this might not be exactly the best approach. However, there are ways in which even that can be worked around: you could encrypt your script files (or compress them), which in itself would be a minor deterent to the curious.
Valderman wrote: | Quote: | Um, sorry if I sound stupid, but I've never understood why anyone in their right mind would want to create a scripting system that just apes assembly. I've never found assembly the least bit easy to code in and it seems to me like hardcoding your gameplay mechanics would be easier than having to write them in an assembly knockoff... am I missing something here? | Naturally, you'll have to create some kind of high-level language for your scripting system to be nice to work with, but you'll have to translate it to something more machine-friendly before executing it, which means bytecode. Almost every programming language in existence works on this basis; C, C++, C#, Java - they all compile into either machinecode or, in the case of Java and C#, bytecode which compiles into machinecode at load-time. The only notable difference is the BASIC family, which runs interpreted, but even these languages can be compiled nowadays. |
Bytecode isn't the be-all-end-all of script translation. You can perfectly well execute a script from its parse-tree representation. Sure, this is a translation of the script into a "machine-friendly" form, but it isn't bytecode (in the normative sense). Due to Moore's Law, computer architecture is increasing in processing power every 18 months or so. Low-ended machines are not what they used to be. It seems rather silly to me to create VMs for programs to increase the speed of those programs under these conditions. Especially when you consider the amount of computer cycles your program is going to be wasting sitting there waiting for user input, or for vertical retrace on the monitor to blit image data. _________________ "...LeoDraco is a pompus git..." -- Mandrake
|
|
Back to top |
|
|
|
Page 1 of 3 |
All times are GMT Goto page 1, 2, 3 Next
|
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
|