Processing math: 100%

Monday, May 25, 2015

Assembly Simulator

Assembly Simulator

by Juan Pablo Vega

 

Problem Statement

 The Assembly Simulator is... actually an assembly emulator for an imaginary processor. A processor is an operation crunching unit consisting of an Arithmetic-Logic-Unit, a register to store comparisons, and has a n-byte addressable memory. An assembly language is low-level programming language designed for a specific processor and with a limited instruction set. In this problem, each instruction has a specific structure: 

[label] command operand[s]

The first element, [label], is an optional identifier for the specific line of the instruction in case of a program branch. The second element, command, is the instruction per se defining either an arithmetic or logical operation on specific operands. The third element, operand[s] is a set of operands with which the previous instruction must be executed.
The full set of instruction is given in the image below: 


The objective is to write a code that emulates a processor able to execute a short program based on the previous assembly language. The input is the (useless) size for the memory S and a list of ordered instructions I to execute. The output is the result of any PRINT command.

 

Hardware Architecture: Lists Everywhere


 Let's first define the architecture we want to emulate. As we already stated, there is a n-byte memory, a register and a program. And yes, lists are the easiest data structure in Python so we are going to use them to define our hardware. In addition to those three lists, we will add a variable index that points to the next instruction to be executed, called the program counter or PC.

memory   = [0 for k in range(256)]
register = [False for k in range(6)]
program  = list()
PC   = 0 
 
The elements of memory are the memory content, the elements of register are flags corresponding to different results of the comparison and the elements of program are the instructions of the program to be executed.

 

 

Fecth-Decode-Execute: A Short Loop


 Let's assume for a moment that we have loaded our program into program. We would have that each instruction is represented by a list of three elements: an optional label (replaced by a white space when not used), a string naming a specific command, and a string representing any valid set of operands.

 For instance:

loop PRINT A5,AF

 would be represented by

['loop', 'PRINT', 'A5,AF'].

Now, suppose that we define each and every command as a Python function that accepts a single list as argument. Said functions would be responsible to decode the content of the set of operands, based on the definition of the command they each represent, and then execute the command. We could then link each function to its name in string format with a hash-table to enable the command itself to be decoded.
As a result of the previous program structure, we would define the main loop a Fetch-Decode-Execute loop like this: 

while PC < len(program):
    instruction = program[PC]
    PC += 1
    ISA[instruction[1]](instruction[2])
 
where the first line establishes the loop, the second line corresponds to the Fetch stage, the third line is the increment of the PC, and the fourth line implements the Decode stage for the commands. ISA[i](j) are calls to command functions, which perform the Decode stage for the operands and then the Execute stage.

The hash-table ISA is defined below:
 
ISA = { 'PRINT' :Print, \
 'MOVE' :Move, \
 'ADD' :Add,  \
 'SUB' :Sub,  \
 'AND' :And,  \
 'OR' :Or,    \
 'XOR' :Xor,  \
 'COMP' :Comp,  \
 'BEQ' :Beq,  \
 'BNE' :Bne,   \
 'BGT' :Bgt,  \
 'BLT' :Blt,  \
 'BGE' :Bge,  \
 'BLE' :Ble  }

Instruction Set Architecture: Easy Parsing



 Before we travel through the meanders of each command function, we need to define two helper functions. We need to be able to going back and forth between string format and integer representation for hexadecimal values. Indeed, we need to decode those values from the operands, where they are written in string format, and use them either as constants or as memory addresses.

Python makes it easy to implement both functions as shown below:
def hextoint(x):
    return int(x,16)

def inttohex(x):
    return hex(x)[2:].upper()
 
We are ready to define the last big piece of the puzzle: the command functions. We will first define PRINT to introduce operands parsing and memory management, as well as the output to our program. Then, we'll move on to MOVE where we take operands parsing and memory management to another level. Finally, we'll define how COMP works and how branching instructions like BEQ operates. Extending The functions described below gives a full definition of the emulator.

 

Output Operation: PRINT


 Following the definition of the command PRINT, we first need to split the set of operands in at least one hexadecimal value. If there is only one operand, we only print the memory content at the location specified by the operand. If two operands are present, we print al memory content between the location specified by the first operand and the location specified by the second one, both included.

Notice the use of hextoint and inttohex and the way memory is addressed.

def Print(x):
    x = x.split(',')
 
    t = inttohex(memory[hextoint(x[0])])
    if len(x) > 1:
        for k in range(hextoint(x[0])+1, hextoint(x[1])+1):
            t = t + ' ' + inttohex(memory[k])
 
    print(t)
    return 

 

Sequential Operations: MOVE


 In PRINT, the operands where direct memory addresses. It is not the case for all commands. Following the definition of MOVE, we first need to split the set of operands in two hexadecimal values. We then classify each of them in one of three categories: constants, direct memory addresses and indirect memory addresses. And operand belongs to the first category if its first character is a # symbol, it belongs to the third category if it is delimited by ( and ), and it belongs to the second category when the value is let alone.
Notice the use of hextoint and inttohex and the way memory is addressed.

def Move(x):
    global memory
 
    x = x.split(',')
 
    if (x[0][0] == '#'):
        d = hextoint(x[0][1:])
    elif (x[0][0] == '('):
        d = memory[memory[hextoint(x[0][1:-1])]]
    else:
        d = memory[hextoint(x[0])]
    
    if (x[1][0] == '('):
        memory[memory[hextoint(x[1][1:-1])]] = d
    else:
        memory[hextoint(x[1])] = d   
    return

 

Branching Operations: COMP and BEQ


 The branching behaviour is split in three steps. First we compare two operands and store the result in our register. We then evaluate if a particular condition has occurred to enable the jump operation. Finally the PC is updated properly. 

The first part is similar to PRINT and MOCE and is defined in COMP. Following its definition, we first split the set of operands in two hexadecimal values. After proper parsing, we store the result of the comparison.
Notice the meaning of each element of register.

def Comp(x):
    global register
 
    x = x.split(',')
 
    if (x[0][0] == '#'):
        A1 = hextoint(x[0][1:])
    else:
        A1 = memory[hextoint(x[0])]
  
    A2 = memory[hextoint(x[1])]
 
    register = [A1 == A2, \
                A1 != A2, \
                A1 >  A2, \
                A1 <  A2, \
                A1 >= A2, \
                A1 <= A2]
    return 
 
The second part is as easy as it seems. It is a logical predication, followed by a condition function call. The index of register[i] that is tested changes depending of the condition, but jump is always the function called.
BEQ is one such predication, as shown below: 
def Beq(x):
    if register[0]:
        jump(x)
    return 
 
Given that the branching destination is always expressed as a label, we need a way to decode the operand from a specific label to a valid memory address. To do so we recall that the first element of each program line is the label. We then (arbitrarily) search for the last occurrence of the label given in the program, and use this line as the program counter update value.
Notice that this update is performed after the automatic update of the main loop.
def jump(x):
    global PC
 
    for k in range(len(program)):
        if program[k][0] == x:
            PC = k
    return

 

The Code


The complete code that solves the challenge is given below:
# Assembler Simulator, by Juan Pablo Vega
# IEEEXtreme 8.0 (2014)

def hextoint(x):
    return int(x,16)

def inttohex(x):
    return hex(x)[2:].upper()

def tobyte(x):
    return x&255

def Print(x):
    x = x.split(',')
 
    t = inttohex(memory[hextoint(x[0])])
    if len(x) > 1:
        for k in range(hextoint(x[0])+1, hextoint(x[1])+1):
            t = t + ' ' + inttohex(memory[k])
 
    print(t)
    return

def Move(x):
    global memory
 
    x = x.split(',')
 
    if (x[0][0] == '#'):
        d = hextoint(x[0][1:])
    elif (x[0][0] == '('):
        d = memory[memory[hextoint(x[0][1:-1])]]
    else:
        d = memory[hextoint(x[0])]
    
    if (x[1][0] == '('):
        memory[memory[hextoint(x[1][1:-1])]] = d
    else:
        memory[hextoint(x[1])] = d   
    return

def Add(x):
    global memory
    x = x.split(',')
    if (x[0][0] == '#'):
        A1 = hextoint(x[0][1:])
    else:
        A1 = memory[hextoint(x[0])]
    A2 = memory[hextoint(x[1])]
    memory[hextoint(x[1])] = tobyte(A2 + A1)
    return

def Sub(x):
    global memory
    x = x.split(',')
    if (x[0][0] == '#'):
        A1 = hextoint(x[0][1:])
    else:
        A1 = memory[hextoint(x[0])]
    A2 = memory[hextoint(x[1])]
    memory[hextoint(x[1])] = tobyte(tobyte(A2) + 256 - tobyte(A1))
    return
    
def And(x):
    global memory
    x = x.split(',')
    if (x[0][0] == '#'):
        A1 = hextoint(x[0][1:])
    else:
        A1 = memory[hextoint(x[0])]
    A2 = memory[hextoint(x[1])]
    memory[hextoint(x[1])] = tobyte(A2 & A1)
    return

def Or(x):
    global memory
    x = x.split(',')
    if (x[0][0] == '#'):
        A1 = hextoint(x[0][1:])
    else:
        A1 = memory[hextoint(x[0])]
    A2 = memory[hextoint(x[1])]
    memory[hextoint(x[1])] = tobyte(A2 | A1)
    return

def Xor(x):
    global memory
    x = x.split(',')
    if (x[0][0] == '#'):
        A1 = hextoint(x[0][1:])
    else:
        A1 = memory[hextoint(x[0])]
    A2 = memory[hextoint(x[1])]
    memory[hextoint(x[1])] = tobyte(A2 ^ A1)
    return
    
def Comp(x):
    global register
    x = x.split(',')
    if (x[0][0] == '#'):
        A1 = hextoint(x[0][1:])
    else:
        A1 = memory[hextoint(x[0])]
    A2 = memory[hextoint(x[1])]
    register = [A1 == A2, \
                A1 != A2, \
                A1 >  A2, \
                A1 <  A2, \
                A1 >= A2, \
                A1 <= A2]
    return

def Beq(x):
    if register[0]:
        jump(x)
    return
    
def Bne(x):
    if register[1]:
        jump(x)
    return

def Bgt(x):
    if register[2]:
        jump(x)
    return
    
def Blt(x):
    if register[3]:
        jump(x)
    return
    
def Bge(x):
    if register[4]:
        jump(x)
    return
    
def Ble(x):
    if register[5]:
        jump(x)
    return
    
def jump(x):
    global PC
    for k in range(len(program)):
        if program[k][0] == x:
            PC = k
    return
 
memory   = [0 for k in range(256)]
register = [False for k in range(6)]
program  = list()
PC   = 0 
ISA      = {'PRINT':Print, 'MOVE':Move, 'ADD':Add, \
            'SUB':Sub, 'AND':And, 'OR':Or, 'XOR':Xor, \
            'COMP':Comp, 'BEQ':Beq, 'BNE':Bne, \
            'BGT':Bgt, 'BLT':Blt, 'BGE':Bge, 'BLE':Ble}

input()

EOF = False; 
while not EOF:
    try:
        inputLine = input().split()
  
        if len(inputLine) == 2:
            line = list(' ')
            line.extend(inputLine)
        else:
            line = inputLine
        
        program.append(line) 
    except EOFError:
        EOF = True;

while PC < len(program):
    instruction = program[PC]
    PC += 1
    ISA[instruction[1]](instruction[2])  

 

Files


This drive folder contains:
  • The solution given in this article
  • Test cases
  • An alternative solution by Juan I Carrano

The Pipeline

The Pipeline

by Juan Pablo Vega

 

Problem Statement 


The Pipeline is searching problem. It is quite conventional. There is a grid map filled with cost values. There is a start and a goal. Nothing really complicated huh? Well, there are some particularities. First, the map is a square grid. Second, there is no one start nor one goal. The search can start from any row of the first column and finish at any row of the last column. Furthermore, one cannot move to the left, only up-down-right steps are allowed. Oh, did I mention that the map can get really huge and that you have limited time to find the optimal path? Fortunately, we are only asked for the cost of the optimal solution. 

 

 Think Twice Before You Start 


One might jump in and try to solve it with all the conventional tools one might have. BFS, DFS, A*, Dijkstra, and many more, are possible methods that could cross your mind. But implementing any of them would be in vain. Simply take a step back and think about the map. It has a known structure: it's a square. It has known dynamics: one can't go left. And most importantly, all the content (cost and constraints) are defined in a descriptive manner, and are not subject to any kind of variation in any way. 

 

Turn It Upside Down 

So, where do I start? From the beginning right? Wrong! Starting from there has many disadvantages. First, you don't know which row is the correct start point and you are going to pay the price of additional, useless computing. Second, and most important you are going to repeat many steps of your search. Ok, what if I start from some sort of optimal internal point? Well, for any internal point there will be a portion of the map that remains on the right, for which it will act as a starting point. And we are back to the disadvantages I mentioned above. Ok, ok, from the last column then. Exactly, but why? Well, if you start from the leaves, you might eliminated once and for all a dead branch. At every level (column) you choose to discard dead ending paths. And this process is incremental, meaning that your only use past pre-processed plus new non-processed information. 

 

 Compute Only Once: Dynamic Programming


That seems interesting right? It is called dynamic programming. It is a method for solving complex problems by first solving simpler sub-problems only once, and then combining all optimal sub-solutions to give the optimal global solution. For those who have never heard of it, stay put and appreciate the magic of it. 

 

 Let's Get Started


Let's assume that the grid map can be found in city and that cost is a copy of the grid map without any content. We will use cost as a structure containing the backward-incremental costs, i.e. the optimal sub-solutions. Let's cut the problem in half. The first sub-problem only permits us to move down-right, while the second, up-right. With these two new problems, the problem gets much easier. Finally, it's important to recognize that the last column of city already states the optimal sub-solutions. Indeed, for every column of the map we have reach the end, and there is no use of going up or down. Thus the last column of city can be copied to the last column of cost.

 

Solving down-right Upside Down


Let's say that we are at a given column j and that we have already found all sub-solutions for n-1, n, ..., j+1. This means that all the information we need to find the sub-solution at j can be found in the j+1th column of cost and in the jth column of city. What are the possible moves from the last row, n? Only right. We then update cost with this sub-solutions. When we analyse row n-1, we find that cost[n][j] and cost[n-1][j+1] are both optimal sub-solutions for the down-right problem. It is easy then to find the sub-solution for row n-1. Similarly, we can find the solutions for any row of column j up to the first.

 

Solving up-right Upside Down


We could follow the same procedure for this sub-problem. But we would soon find that there is a certain degree of overlap. Well, in down-right, we already considered the case where going down and going right were the optimal sub-solution. And the result was saved in cost. Now, in up-right we are considering going right again. It is of no use if the values of cost are shared among the sub-problems. That way, we should do as follows. From row 1 sequentially until row n-1 we would compare the possibility of going up against the optimal sub-solutions of the previous sub-problem.

 

From Right To Left


Repeating down-right and then up-right for every column, starting from column n-1 up until column 0 represents the process of combining optimal sub-solutions to fin the optimal global solution. The last step, once the first column have been processed, is to find the minimum accumulated cost, located somewhere in the first column of cost

 

 Sample Code


Here is a snippet of the search block of the program:
# The Pipeline, by Juan Pablo Vega | IEEEXtreme 8.0 (2014)
for j in range(n-2,-1,-1):
    for i in (range(n-1,-1,-1)):
        if i < (n-1):
            cost[i][j] = min(city[i][j]+cost[i][j+1], city[i][j]+cost[i+1][j])
        else:
            cost[i][j] = city[i][j]+cost[i][j+1]
    for i in range(n):
        if i > 0:
            cost[i][j] = min(cost[i][j], cost[i-1][j]+city[i][j])

 

Files


The solution is available at this drive folder.

Wednesday, November 19, 2014

Run Me

Run Me - IEEE Xplore problem

The Problem

 

This problem is hard. Very hard. In fact, we couldn't solve it during the competition, but we didn't give up and finally we were able to solve it... one day after the competition ended (tough luck).

Problem statement 

 

You are given a hex dump of a MS-DOS program, that is 8086 machine code. Of course, you are not told what does the program do. You must then submit your own program that does exactly the same as the 8086 code. To make things harder, whatever the 8086 program does, it does it inefficiently. Your submission must run within the alloted time.

Here is the machine code:

BF 00 04 BE C0 00 56 31 C9 B4 00 CD 16 3C 2E AA
E0 F7 F7 D1 29 D2 89 CD 5B 53 FE 07 75 03 43 EB
F9 BF 00 02 89 F9 89 F8 F3 AA 89 FE AC 89 C3 FE
07 80 FB 2E 75 F6 FE 0F 5E 56 89 E9 AC 89 C3 FE
0F 7C D5 E2 F7 42 5E 56 89 E9 F3 A6 75 CA 5D 92
D4 0A E8 00 00 86 C4 04 30 CD 29 C3

Pretty scaring, isn't it?
We are told that this programs accepts a string of characters terminated with a dot ("."). We are not told anything about the output, except it is terminated with a newline.

The disassembly 

 

Before we try and solve this problem, it would be nice to have a disassembly of the above code:

;======================================
; IEEE Xtreme 8.0 (2014)
; IEEE Xplore Run-me Problem
;======================================
code segment
  assume cs:code,ds:code,es:nothing,ss:code

  org 100h 
ip  label near

start: MOV DI,0400h
  MOV SI,00C0h                            
  PUSH SI                                 
  XOR CX,CX                              
L003: MOV AH,00h                              
  INT 16h                                 
  CMP AL,2Eh                              
  STOSB                                    
  LOOPNZ L003                               
  NOT CX                                 
  SUB DX,DX                              
  MOV BP,CX                              
L005: POP BX                                 
  PUSH BX                                 
L001: INC BYTE PTR [BX]                      
  JNZ L002                                
  INC BX                                 
  JMP L001                               
L002: MOV DI,0200h                            
  MOV CX,DI                              
  MOV AX,DI                              
  DS:REPZ                                    
  STOSB                                    
  MOV SI,DI     
L004: LODSB                                    
  MOV BX,AX                              
  INC BYTE PTR [BX]                      
  CMP BL,2Eh                              
  JNZ L004                               
  DEC BYTE PTR [BX]                      
  POP SI                                 
  PUSH SI                                 
  MOV CX,BP                              
L006: LODSB                                    
  MOV BX,AX                              
  DEC BYTE PTR [BX]                      
  JL  L005                               
  LOOP L006                               
  INC DX                                 
  POP SI                                 
  PUSH SI                                 
  MOV CX,BP                              
  DS:REPZ                                    
  CMPSB                                    
  JNZ L005                               
  POP BP                                 
  XCHG DX,AX                              
  AAM                                    
  CALL L007                               
L007: XCHG AL,AH                              
  ADD AL,30h                              
  INT 29h                                 
  RET                                    
end_code label near
code  ends
  end ip

You can download it here: Run-me.asm

 Solution


The Tedious Way


After hours of reading over and over the same x86 assembly file we came out with this monolithic and self-explanatory asm files.
The first one, called "first-step.asm" is a line-by-line commented assembly file with an introduction for non assembly users:
;======================================
; IEEE Xtreme 8.0 (2014)
; IEEE Xplore Run-me Problem
;======================================


code    segment
        assume  cs:code,ds:code,es:nothing,ss:code

        org 100h
ip      label   near
; The code above is only there because its a *.com
; file (program starts at 0x0100)

; This comments are intended as a firs approach to
; understanding this code.

; A "C Style" pseudo-code will be used to explain
; at the end of each line.

; Initial considerations:
;
;   CX:
;       is a 16-bit register
;       In this program, CX is used mainly as a counter
;
;   AX:
;       It's a 16-bit register used for operations.
;       In this program, AX is used to read/store
;       strings to from/to memory, read input from
;       keyboard and print on screen data.
;
;   AH is the most significant Byte of AX
;   AL is the least significant Byte of AX
;
;   BX:
;       It's a 16-bit register used for operations.
;       In this program, BX is used as a pointer
;
;   BH is the most significant Byte of BX
;   BL is the least significant Byte of BX
;
;   DI and SI:
;       DI and SI are used as pointers for writing/
;       reading strings from memory.
;       SI is the Source.
;       DI is the Destiny.
;
;   Flags
;       ZF is the Zero flag. It's turned on when
;           an operation of compare, decremet,
;           increment or any other operation that
;           sets flags returns 0.
;
;   Stack
;       TOS (Top Of Stack) is the top of stack :)
;
;   BP is a register used as a variable.


start:  MOV DI,0400h        ; DI = 0x0400;
        MOV SI,00C0h        ; SI = 0x00C0;
        PUSH SI             ; TOS = SI; (TOS=0x00C0)
        XOR CX,CX           ; CX ^= CX; (this sets CX=0)
L003:   MOV AH,00h                  ; AH = 0x00;
        INT 16h             ; AL = GETCHAR();
        CMP AL,2Eh          ; ZF = ((AL-0x2E)==0);
        STOSB               ; *(DI++) = AL;
        LOOPNZ  L003        ; CX--; IF((CX!=0)&&(ZL==FALSE)) GOTO L003;
        NOT CX              ; CX = ~CX;
        SUB DX,DX           ; DX -= DX  (sets DX=0 and ZF=1)
        MOV BP,CX           ; BP = CX;
L005:   POP BX              ; BX = TOS;
        PUSH BX             ; TOS = BX;
L001:   INC BYTE PTR [BX]   ; (*BX)++; (if (*BX) is initially 0xFF, this operation will turn ZF=1)
        JNZ L002            ; IF(ZF==FALSE) GOTO L002;
        INC BX              ; BX++;
        JMP L001            ; GOTO L001;
L002:   MOV DI,0200h        ; DI = 0x0200;
        MOV CX,DI           ; CX = DI; (CX = 0x0200)
        MOV AX,DI           ; AX = DI; (AX = 0x0200)
        DS:REP              ; REP runs the next string operation in a loop
        STOSB               ; DO{ CX--; *(DI++)=AL; }WHILE((CX!=0))
        MOV SI,DI           ; SI = DI;
L004:   LODSB               ; AL = *(SI++);
        MOV BX,AX           ; BX = AX;
        INC BYTE PTR [BX]   ; *(BX)++;
        CMP BL,2Eh          ; ZF=((BL-0x2E)==0);
        JNZ L004            ; IF(ZF==FALSE) GOTO L004;
        DEC BYTE PTR [BX]   ; *(BX)--;
        POP SI              ; SI = TOS;
        PUSH    SI          ; TOS = SI;
        MOV CX,BP           ; CX = BP;
L006:   LODSB               ; AL = *(SI++);
        MOV BX,AX           ; BX = AX;
        DEC BYTE PTR [BX]   ; *(BX)--;
        JL      L005        ; JL means Jump if Less (in this case I'll just
                            ; use: IF(*(BX)<0) GOTO L005; )
        LOOP    L006        ; CX--; if(CX!=0) GOTO L006;
        INC DX              ; DX++;
        POP SI              ; SI = TOS;
        PUSH    SI          ; TOS = SI;
        MOV CX,BP           ; CX = BP;
        DS:REPZ             ; REPZ runs the next string operation in a loop
        CMPSB               ; DO{ CX--; ZF=((*(SI++)-*(DI++))==0); }WHILE((CX!=0)&&(ZF==TRUE))
        JNZ L005            ; IF(ZF==FALSE) GOTO L005;
        POP BP              ; BP = TOS;
        XCHG    DX,AX       ; SWAP(DX, AX);
        AAM                 ; AH = AL/10; AL = AL%10;
        CALL    L007        ; L007();
L007:   XCHG    AL,AH       ; SWAP(AL, AH);
        ADD AL,30h          ; AL+=0x30;
        INT 29h             ; PUTCHAR(AL);
        RET                 ; RETURN;

end_code    label   near
code        ends
        end ip
You can download it here:  first-step.asm

The second file, called "second-step.asm" goes one step further and groups the assembly instructions into easy-to-understand blocks:
;======================================
; IEEE Xtreme 8.0 (2014)
; IEEE Xplore Run-me Problem
; Second Step - Identifying code groups
;======================================

;   The second step to solve this problem is finding code groups
;   and identifying it's purpose.

;-------------------------------------------------------------------------
;   Observations:
;       The program uses POP and PUSH one after each other. This means that
;       the TOS (Top Of Stack) is used as a variable for holding the value
;       0x00C0.
;       For example: (lines 34 & 35)
;           Initially, the TOS value is 0x00C0.
;
;               POP SI              ; SI = TOS;
;           This means that SI = 0x00C0; and the TOS will now be the next
;           value on the stack;
;               
;               PUSH    SI          ; TOS = SI;
;           The stack is re-filled with the value 0x00C0. In the end, TOS
;           remained unchanged.
;
;           In the end, the stack remained unchanged and SI = 0x00C0
;-------------------------------------------------------------------------



;-------------------------------------------------------------------------
;   String Input:
;       If we look carefully, the first 10 assembly lines are intended to
;       receive a string from keyboard with a terminator character '.'
;-------------------------------------------------------------------------

start:  MOV DI,0400h        ; DI = 0x0400;
                            ; The destiny pointer is 0x0400 and the string
                            ; will be stored there.
                            
        MOV SI,00C0h        ; SI = 0x00C0;           
        PUSH SI             ; TOS = SI; (TOS=0x00C0)           
                            ; This is not used for string storage. Its used
                            ; to initialize TOS as 0x00C0 (see Observations above).
                                
        XOR CX,CX           ; CX ^= CX; (this sets CX=0)
        
L003:   MOV AH,00h          ; AH = 0x00;
        INT 16h             ; AL = GETCHAR();
        CMP AL,2Eh          ; ZF = ((AL-0x2E)==0); 
                            ;   0x2E is the ASCII value for character '.'
        STOSB               ; *(DI++) = AL;
                            ;   Store AL value on 0x0400 initially and then on the
                            ;   next position and so.
                            ;   ZF is not modified by this instruction                          
        LOOPNZ  L003        ; CX--; IF((CX!=0)&&(ZF==FALSE)) GOTO L003;
                            ;
                            ;   CX--; is not an instruction. it's a representation
                            ;   of what LOOPNZ does and it does NOT modify ZF.
                            ;
                            ;   first comparison (CX!=0):
                            ;   CX started as 0. first it's decremented and then
                            ;   it's compared to be different than 0.
                            ;   The first comparison made is 0xFFFF!=0.
                            ;   This means that a maximum of 0xFFFF characters 
                            ;   can be taken as input before breaking the loop 
                            ;   due to the (CX!=0) comparison. (this will never happen :))
                            ;
                            ;   Second comparison (ZF==FALSE):
                            ;   As mentioned before, the instruction STOSB does NOT
                            ;   modify the ZF (Zero Flag) value.
                            ;   This means that this comparison can be replaced by the 
                            ;   instruction CMP AL,2Eh.
                            ;   So: ((AL-0x2E)==0)==FALSE => ((AL==0x2E))==FALSE
                            ;           => AL!=0x2E  (or AL!='.')
                            ;
                            ;   
                            ;   
        NOT CX              ; CX = ~CX;
                            ; CX started as 0x0000. Then, CX was decremented on each loop
                            ; until the input was '.'
                            ; In the end, CX = 0x0000 - Number of inputs
                            ; so CX is MINUS (the number of characters on the string plus one 
                            ; (because of the terminator character));
                            ;
                            ; On signed binary arithmetic, to turn a number to it's negative
                            ; value, 2's complement must be applied. This can be done by 
                            ; applying a NOT to every bit and then adding one.
                            ; We can say that:
                            ;       -A = NOT(A) + 1;
                            ; or in this case:
                            ;       NOT(A) = -A - 1;
                            ;
                            ; So when we do CX = ~CX, whe are doing:
                            ;       CX = NOT(CX) = -CX - 1;
                            ; Then:
                            ;       CX = -( -(number_of_characters_on_string + 1) ) -1
                            ;       CX = number_of_characters_on_string +1 -1
                            ;       CX = number_of_characters_on_string; 
; This code group can be re-written as :
;
;   char* p2memory = 0x0400;
;   uint16 n_characters = 0;
;   do{
;       AL = getchar();
;       *(p2memory++) = AL;
;       n_characters++;
;   }while(AL!='.');
;   n_characters--;
;


;-------------------------------------------------------------------------
;   Reverse "Big Num" Counter:
;   The lines below are used to create a "Big Num" counter.     
;   The counter goes from left to right and can grow undefined.
;   Like this:
;===============================================================
;  0x00C0  0x00C1  0x00C2  0x00C3  0x00C4  0x00C5  0x00C6   ...
;===============================================================
;   0x00    0x00    0x00    0x00    0x00    0x00    0x00    ...
;   0x01    0x00    0x00    0x00    0x00    0x00    0x00    ...
;   0x02    0x00    0x00    0x00    0x00    0x00    0x00    ...
;    .       .       .       .       .       .       .
;    .       .       .       .       .       .       .
;    .       .       .       .       .       .       .
;   0xFF    0x00    0x00    0x00    0x00    0x00    0x00    ...
;   0x00    0x01    0x00    0x00    0x00    0x00    0x00    ...
;    .       .       .       .       .       .       .
;    .       .       .       .       .       .       .
;    .       .       .       .       .       .       .
;   0xFF    0xFF    0x00    0x00    0x00    0x00    0x00    ...
;   0x00    0x00    0x01    0x00    0x00    0x00    0x00    ...
;    .       .       .       .       .       .       .
;    .       .       .       .       .       .       .
;    .       .       .       .       .       .       .
;   0xFF    0xFF    0xFF    0xFF    0xFF    0xFF    0xFF    ...
;-------------------------------------------------------------------------

        SUB DX,DX           ; DX -= DX  (sets DX=0 and ZF=1) 
        MOV BP,CX           ; BP = CX;  BP is only used to store the number of chars
                            ;           obtained in the "String Input" section.
                            
L005:   POP BX              ;
        PUSH BX             ; BX = 0x00C0 (see Observations)
        
L001:   INC BYTE PTR [BX]   ; (*BX)++;  (if (*BX) is initially 0xFF, 
                            ; this operation will turn ZF=1)
        JNZ L002            ; IF(ZF==FALSE) GOTO L002;
        INC BX              ; BX++;             
        JMP L001            ; GOTO L001;                   
                            ;
                            ; This is a simple counter:
                            ; - Increment *(BX)  (which is 0x00C0 initially)
                            ; - If (ZF==FALSE) continue with the rest of the code
                            ;   (didn't went from 0xFF to 0x00)
                            ; - If (ZF==TRUE) increment next memory position.
                            ;   and loop.
;
; This code group can be re-written as :
;   char *BX = 0x00C0;
;   while((*BX)++ == 0xFF)
;       BX++;
;
                            


;-------------------------------------------------------------------------
;   Memset:
;   The lines below are used to clean the memory positions from             
;   0x0200 to 0x03FF. Like a memset ( 0x0200, 0, 0x1FF );
;+=========================================================+
;|  0x0200  0x0201  0x0202  0x0203  0x0204   ...   0x03FF  |
;+=========================================================+
;|   0x00    0x00    0x00    0x00    0x00    0x00   0x00   |
;+---------------------------------------------------------+
;-------------------------------------------------------------------------                      
L002:   MOV DI,0200h        ; DI = 0x0200;  Set the destiny pointer to 0x0200         
        MOV CX,DI           ; CX = DI; (CX = 0x0200) count = 200     
        MOV AX,DI           ; AX = DI; (AX = 0x0200) AL will be stored so 
                            ;   this means that it will memset 0x00.
                            ;
                            ; The 0x02 value assigned to AH is very important
                            ; and it will be used along the code. (keep that in mind)
                            ;
        DS:REP              ; REP runs the next string operation in a loop
        STOSB               ; DO{ CX--; *(DI++)=AL; }WHILE((CX!=0))
                            ;
; This code group is REALLY simple so no explanation is needed.
; This code group can be re-written as :
;   AH = 0x02;  //(VERY IMPORTANT FOR THE REST OF THE CODE)
;   memset(0x0200, 0, 0x1FF);       


;-------------------------------------------------------------------------
;   Character counter:
;   The lines below are used to "count" the number of occurrences of 
;   each character in the string.
;   The memory section from addresses 0x0200 to 0x03FF are used as 
;   variables to storage the number of times each possible character is
;   present in the input string. (with the exception of character '.')
;   For example:
;   The position 0x0230 holds the number of times the character '0' 
;   (ASCII value 0x30) appears in the input string.
;-------------------------------------------------------------------------                      

            
        MOV SI,DI           ; SI = DI; 
                            ; DI was 0x0400 from last code group and it 
                            ; was pointing to the first character of the
                            ; string.
L004:   LODSB               ; AL = *(SI++); 
                            ; load a character of the string.
                            ; AL now holds the ASCII value of    a character
        MOV BX,AX           ; BX = AX;
                            ; If you kept in mind that AH contains 0x02, you'll 
                            ; see that BX is pointing to 0x02AL being AL the 
                            ; ASCII value of a character from the string.                           
        INC BYTE PTR [BX]   ; *(BX)++;
                            ; Increment the value of the memory position 0X02AL
                            ; (incrementing the counter for the character AL)
        CMP BL,2Eh          ; ZF=((BL-0x2E)==0); 
                            ; BL holds the ASCII value of a character from the 
                            ; string. This comparison is checking if we reached
                            ; the string terminator 0x2E ('.').
        JNZ L004            ; IF(ZF==FALSE) GOTO L004;
                            ; If not(end of string) loop.
                            ;
                            ; The following line is executed after the end of string
                            ; is reached. Notice that after that, BX will be pointing
                            ; to 0x022E (counter for the '.' character).
                            ; Take into consideration that the code BX is incremented 
                            ; BEFORE the comparison to '.' so the memory position 
                            ; 0x022E will have a 1 (showing the one and only occurrence)
                            ; of the character '.' in the string.
                            ; Knowing all this, the next line simply erases the counter
                            ; for the character '.'
        DEC BYTE PTR [BX]   ; *(BX)--;         
; This code group can be re-written as :
;   for (int i=0 ; i<n_characters ; i++)
;       *(0x0200+i)++;  
        


;-------------------------------------------------------------------------
;   Permutation Finder:
;   This code section checks if the counter bytes are a permutation of the
;   input string.
;-------------------------------------------------------------------------
        
        POP SI              ; 
        PUSH    SI          ; SI = 0x00C0 (see Observations)
        MOV CX,BP           ; CX = BP;  Load n_characters   
                            
L006:   LODSB               ; AL = *(SI++); Load a counter byte
        MOV BX,AX           ; BX = AX;  BX = 0x02AL with AL = current counter byte
        DEC BYTE PTR [BX]   ; *(BX)--;  Decrement memory location 0x02AL 
                            ; Next line will evaluate the result of this instruction.
                            ;
                            ; It's necessary to remember that the memory locations 
                            ; with address 0x02CC is either 0x00 if CC is NOT a character
                            ; present in the string; or 0xXX being XX the number of times
                            ; the character CC is indeed present in the string.
                            ;
                            ; There are two possible outcomes for the next conditional
                            ; jump:
                            ; 
                            ;   - The selected "big number" counter byte is a value in the 
                            ;     string => the memory position we just decremented had the
                            ;     number of times a character appears in the string => the
                            ;     decrement operation did not trigger a "Less" flag.
                            ;
                            ;   - The selected "big number" counter byte is NOT a value in 
                            ;     the string => the memory position we just decremented had
                            ;     0x00 before DEC operation => the decrement operation did 
                            ;     trigger a "Less" flag.
                            ;
                            ; Take into consideration that L005 is the begin of the loop,
                            ; just before the "big number" counter.
                            ;   
        JL      L005        ; JL means Jump L005 if Less 
                            ; If selected "big num" counter byte is not in the string, start again
                            ; If selected "big num" counter byte IS in the string:
                            ; Decrement CX(CX has the number of characters to find),
                            ; if CX !=0 (didn't compare to every character) keep comparing (L006)
        LOOP    L006        ; CX--; if(CX!=0) GOTO L006;                            
                            ; This point is reached if we already compared every character and
                            ; each comparison was true (each "big num" counter byte is a character
                            ; in the string). Notice that the bytes of the counter must NOT be in
                            ; order so "big num" counter is a permutation of the input string.
                            ;
        INC DX              ; DX++;     DX is the number of permutations found.
;
; This code group can be re-written as:
;   if ("big num" counter) in permutations(input string): DX++
;   
        
;-------------------------------------------------------------------------
;   "Big Num" counter limit:
;   This code group checks the value of the counter located at 0x00C0 
;   and leaves the main loop if that condition is true.
;   Note:   The end condition is a permutation of the input string, so
;   evaluating the end condition after the permutation is okay :)
;-------------------------------------------------------------------------      
        POP SI              ;                   
        PUSH    SI          ; SI = 0x00C0; (See Observations)                      
        MOV CX,BP           ; CX = n_characters;              
        DS:REPZ             ; REPZ runs the next string operation in a loop                       
        CMPSB               ; DO{ CX--; ZF=((*(SI++)-*(DI++))==0); }WHILE((CX!=0)&&(ZF==TRUE))
                            ; This two lines above compare each byte of the "big num" counter 
                            ; with each character of the input string.
                            ; 
                            ; keep looping until "big num" counter is equal to the string.
        JNZ L005            ; IF(ZF==FALSE) GOTO L005;               
                            ; 
;
; This code group can be re-written as:
;   if ("big num" counter) != (input string)
;       goto L005;
;   


;-------------------------------------------------------------------------
;   Print Output:
;   This code prints the contents of DX on the screen
;-------------------------------------------------------------------------      

        POP BP              ; BP = 0x00C0;  Why???
                            ;
        XCHG    DX,AX       ; SWAP(DX, AX); AX = DX (load permutation counter on AX)
        AAM                 ; AH = AL/10; AL = AL%10; AAM converts AX to a printable form
                            ; If DX = NM then 
                            ; AH = N 
                            ; and 
                            ; AL = M
                            ; for example:
                            ; DX = 24 -> AH = 2; AL = 4;
                            
        CALL    L007        ; L007(); This is a well known trick. Call a function at
                            ; L007, and that function's code is right beneath. So that
                            ; code will be executed twice. 

; FIRST TIME CALLED:                            
        XCHG    AL,AH       ; SWAP(AL, AH); INT 29h prints the content of AL.
                            ; this means we are displaying the first digit
                            ; (which was originally in AH)
        ADD AL,30h          ; AL+=0x30; Convert it to ASCII by adding '0'       
        INT 29h             ; PUTCHAR(AL); print it
        RET                 ; RETURN;               

; SECOND TIME:
L007:   XCHG    AL,AH       ; SWAP(AL, AH); display de second digit
        ADD AL,30h          ; AL+=0x30;     
        INT 29h             ; PUTCHAR(AL);                    
        RET                 ; RETURN;               
    
; END       

end_code    label   near
code        ends
        end ip

Download it here: second-step.asm

At this point you should have an idea of what this program is doing. If you don't then you haven't spent enough hours with these files. Just kidding, if you don't understand yet what this program does please read on to the following section.

Sunday, November 2, 2014

RoadTrip


Problem description 

 

The basic idea of the problem was the following: 

You are presented with a scenario where you have to travel a certain amount of miles in your car, and in the way there are a bunch of gas stations, each one of them with a different price. The idea is that you have to make your trip spending the least amount of cash, being careful not to run out of gas.

For the full description check: IEEE Extreme 2014

 

Input

  • N (0 < N < 50001): The total number of gas stations. 
  • F (0 < F < 1000001): The units of fuel the car can take. 
  • T (0 <= T <= F): The units of fuel the car has at the beginning of the trip. 
  • L (0 < L < 1000000001): The path length of the landmarks he plans to visit
  • Each of the following N lines will contain two integers: the first one, D_i (0 <= D_i <= L) corresponds to the distance of the station from the starting point, and the second one, C_i (1 <= C_i <= 1,000,000) represents the cost per fuel unit for that station.
Note: You may assume that the trip will be on a straight line where all gas stations are spread on this line at the positions specified by their D_i values

Basic concepts

 
  •  To try an get the best possible solution, the idea is to spend the least amount of money, making the necessary stops in they way, we don't mind the number of stops. 
  • We should not get extra gas than what we need to get to the end point (if we are left with extra gas, it means that we could've got less gas at the station and saved some cash)
  • The trip may be impossible to do, as there may be a distance between gas stations that is greater than our fuel tank (in this case, even if we fill the whole tank we will ran out of gas in between stations).
  •  We should check if we have gas at the starting point
 

The solution


Well, the solution  to this problem is pretty simple. It relies in only one idea, that is the following:

"If I need to get extra gas to perform the trip, I should always go to the cheapest station"

That is, if I can't do the trip without loading extra fuel, I should get it at the cheapest station. Even if it is not enough for the trip, I should aim at reaching to the cheapest station with zero gas, loading a full tank there and then continue with the trip.

Why get there with zero gas?
Basically, if I get there and have some fuel in my tank, I either got it from the start (it's ok) or load it somewhere else (not ok, because that means I could've got less gas earlier and load that amount here, that is cheaper)

Why should I load a full tank?
Well, if I can reach the goal with less fuel, then I should get the exact amount I need, but if that's not the case, I should get the whole tank, as anywhere else it will be more expensive.

So, what now? It's simple, just divide and conquer.

Let's imagine I only have to make one stop. I start with T units of gas in my car, get to the cheapest station load what I need there and continue to the goal, getting there with zero gas.

But, what if I need to make more stops? The problem doesn't change, you just have to solve it twice.

You choose the cheapest station (C). 

START  --------------------------- C ----------------------------------- END-GOAL
                          (1)                                    (2)

(1) I start with T units of gas in my car and my goal is C. If I can't get there directly, I get the necessary amount of gas in the cheapest station to get to my goal with an empty tank.

* I load what I need (call it Y) at C.

(2) Then I start with Y units of gas at C, and my goal is the end goal.

If you can't get to C directly you will have the following scenario:

START  ------------ C2 ------------- C ----------------------------- END-GOAL
                 (1)               (2)                         (3)

where C2 is the cheapest station between START and C.

It is simple to see that you now have a recurrence problem. There's nothing else to say, just get a NASA trained monkey to code it.



Java solution

 

JAVA CODE

I'm not proud of this code, but it had to be done, quickly, and without sleep. Don't tell my mom I used pointers in Java. I swear I never code like this.

The Grand Integer Lottery

Grand Integer Lottery (Python 3)


Problem Description


The Grand Integer Lottery is a deterministic search over a integer sequence ranging from S to E, based on a set of integers {n}=[n1, ..., nN].
Let's define the concept contiguous occurrence in string format (COSF) of a number a in a number b: we say that there is a contiguous occurrence in string format of a in b if, when both are considered as strings, a occurs as a contiguous block in b.

Examples of COSF: 3 in 13, 23 in 1234.
Examples of non-COSF: 3 in 12, 23 in 32134.

The lottery sequence is defined as the subset L of [S, E] such that for any l in L it is true that there is COSF of some integer nk in l.

Example: S=1, E=35, {n}=[3, 11], then the lottery sequence is [3, 11, 13, 23, 30, 31, 32, 33, 34, 35].

The winning number P is defined as the Pth number in the lottery sequence if there is one, otherwise there is none.

Functional programming in Python (3.x, not 2.x)


Let's suppose that we already have the input data: S, E, {n} and P.
The lottery sequence can be generated in-line:
lotseq = list(filter((lambda l: any(x >= 0 for x in [l.find(x) for x in n])), [str(x) for x in list(range(S,E+1))]))

And the output is a simple if-else evaluation:
if P <= len(lotseq):
    print(lotseq[P-1])
else:
    print('DOES NOT EXIST')

Wow! Not so fast! 


Well that was easy. Considering that no timeout appear when using this implementation, we will not study the complexity of the solution. Instead, we will focus on understanding the power of each sub-expression.
First, the initial integer sequence [S, E]:
[str(x) for x in list(range(S,E+1))]

This snippet of code creates a sequence of strings where each element an integer in [S, E] converted to string format. Simple right? Second, the COSF implementation:
l.find(x)

That is not cheating. Python find function for string actually implements COSF. Here l is an element of lotseq. Third, deciding is an element of the initial sequence is to be placed in the lottery sequence:
any(x >= 0 for x in [l.find(x) for x in n])

This is the decision rule. It predicates if there is COSF of any element of n in l. Finally, applying the decision rule for every element of the initial sequence: 
list(filter((lambda l: any(x >= 0 for x in [l.find(x) for x in n])), [str(x) for x in list(range(S,E+1))]))

We made use of a lambda expression to define an anonymous in-line functional, and then applied it to the initial in a special way. This special way called filter takes one element of the initial sequence at a time and puts it in the lottery sequence only if the decision rule is satisfied. After assigning the result of this supercalifragilisticexpialidocious function to a variable, it's just a matter of finding the correct winning number, if there is any.

The Code


 The complete code that solves the challenge is given below:
# Grand Integer Lottery, by Juan Pablo Vega
# IEEEXtreme 8.0 (2014)
S, E, P, N = [int(k) for k in raw_input().split()]

n= []
for k in range(N):
    n.append(str(int(input())))

lotseq = list(filter((lambda l: any(x >= 0 for x in [l.find(x) for x in n])), [str(x) for x in list(range(S,E+1))]))

if P <= len(lotseq):
    print(lotseq[P-1])
else:
    print('DOES NOT EXIST')

You can download it here.

Alternative solution


A slightly different code:
# Grand integer lottery. Alternative solution by Juan I. Carrano.
# IEEEXtreme 8.0 (2014)
S, E, P, N = [int(k) for k in raw_input().split()]

nn = []
for dummy in xrange(N):
    nn.append(raw_input().strip())

def vale(t):
    coso = str(t)
    return any(ns in coso for ns in  nn)

lseq = [t for t in xrange(S, E+1) if vale(t)]

try:
    print lseq[P-1]
except IndexError:
    print "DOES NOT EXIST"

Download it.

More

In this Drive folder you will find code and test cases.

Nekops-Nu Sequences

Nekops-Nu Sequences (the lazy way)

Problem statement

 

Nekops-Nu sequence


A Nekops-Nu sequence is no more than a run-length encoding (RLE). You take a sequence and for each run of identical numbers, you output the number of repetitions and the number itself.
The sequence thus generated can be itself run-length encoded. By applying the RLE recursively we can generate a sequence of sequences.

 

Input


The input consists of one line with numbers separated by spaces. The first number, N, is the number of sequences to compute (i.e, how many times we have to apply the RLE function). The remaining numbers are the starting sequence.

 

Output


First line of the output should consist of the original sequence. The second line is the RLE of the original sequence. The second line is the RLE of the RLE of the original sequence... you get it? The last line must contain the number of elements of the last sequence. The output must have N+2 lines.
 As an additional annoyance, we are told that the sequences must be center aligned, and padded with "." characters to the width of the longest one.

Original problem statement

 

This [pdf] is the problem, as it was posted in the competition.

Solution

 

RLE in python


The first step is googling "python RLE". This stackoverflow question mentions the groupby() function which (almost) does what we want:
def neknu(s):
    nk = []
    for d, l in groupby(s):
        nk.append(len(list(l)))
        nk.append(d)

    return nk

 

Centering


Another easy one. The builtin str class has a center() method that lets us specify the padding character. Not only that: if it has to add an odd number of padding characters, the left side will have one more character than the right side, just as required by the challenge:
longest = max(len(s) for s in Seqs)
print "\n".join(s.center(longest, '.') for s in Seqs)

The Code


# Nekops-Nu sequences, by Juan I. Carrano | IEEEXtreme 8.0 (2014)
from itertools import groupby

sk, p, sseq = raw_input().partition(' ')

k = int(sk)
seq0 = [int(n) for n in sseq.split()]

Seqs = [" ".join(str(s) for s in seq0)]

def neknu(s):
    nk = []
    for d, l in groupby(s):
        nk.append(len(list(l)))
        nk.append(d)

    return nk

lastl = len(seq0)

for dummy in xrange(k):
    seq0 = neknu(seq0)
    lastl = len(seq0)
    Seqs.append(" ".join(str(s) for s in seq0))

longest = max(len(s) for s in Seqs)

print "\n".join(s.center(longest, '.') for s in Seqs)
print lastl

Files


The solution, testcases and  original problem statement are available at this Drive folder.

The Heist

The Heist, aka Rob to the Big Bank, in Python

 

Problem description

The "official" description of this problem was not very clear to me, so I will try to give my own explanation.

Imagine there are N robbers. Each robber can carry up to M (an integer) units of weight, for a combined load capacity of N*M units. They open a vault that contains different types of valuable objects. Each type of object has some unit weight and unit value. You must choose how many items of each type the robbers must take in order to maximize the value.
The following rules apply:
  1. The total weight of all items the robbers take must be less than their total load capacity, M*N.
  2. The number of items of each type must be an integer, or zero.
  3. The vault has an infinite amount of  each object, so the only limiting factor is rule (1).

Mathematical description

Input variables:
  • N,M:=number of robbers, load capacity per robber
  • k:=different types of items
  • Weightk,Valuek:=Weight, Value of an item of the k-th type
Problem: Find the quantities Q=(q1,,qk)Nk0,i<=ki=1qi×WeightiM×N Such that Q maximizes i<=ki=1qi×Valuei

Input & output formats

Input

The first line contains N and M, separated by a space.
The following lines contain the definitions for each type of object: Name, unit weight, unit value, separated by commas, without spaces. Note that the original problem statement, available at Hackerrank, is not very clear about this. I used this interpretation as unit weight and value.
The last line contains the word END, in capital letters.

For example:

5 71000
Gold,34500,3000
Bronze,29000,80
Silver,14800,1200
END 

Output

For each item type there should be a line containing the name of the item, the number of items the robbers should take, and the total weight and value of those items the robbers will be taking. Fields should be separated by commas, with no spaces. Items whose quantity is zero should be omitted. The items should be listed in alphabetical order.
Then there should be a line with the total weight and value the robbers will be taking, separated by commas with no spaces.
Finally, there should be a line with the text "Each robber gets: <value>", where <value> should be replaced by the total value divided by the number of robbers, expressed to two decimal digits by rounding.

For example, the output corresponding to the above input is:

Gold,9,310500,27000
Silver,3,44400,3600
354900,30600
Each robber gets: 6120.00 

Solution

 Of course, we could try all possible combinations of items. However, we should expect this approach to time-out as the number of combinations can be very high.
The original problem  statement gives us a clue as to how to solve the problem. It says the when the robbers are considering which items to carry they start by the ones that have a higher value to weight ratio.
In the above example, gold has a value/weight of (3000/34500) = 0.08696, followed by silver with 0.08108 and least valuable is bronze, with 0.00276.
The robbers could then take 10 units of gold, except this would not be an optimal solution. If the took 10 units of gold, each weighting 345000, they would have some remaining load capacity they would be unable to use. If they take one less unit of gold, they can fit 3 units of silver, and 3 silver units are move valuable than 1 gold.
We can now outline our solution.

Outline of the solution

The first approach is as follows:
  1. Start by sorting the different types of items according to their value/weight ratio.
  2. Then start by calculating how many units you can take of the most valuable (by weigth) item.
  3. Subtract the weight of these items from your capacity.
  4. Repeat the 2 and 3 with the rest of the items.
This procedure has linear complexity, but does not produce optimal results.
As we mention previously, it is not always the best idea to take all we can of certain item. We should add an additional step, 2b, in which we bifurcate: instead of trying to take only the maximum number of items, we try with all possible values.
This modification gives our algorithm exponential complexity. We are essentially trying an exhaustive search. Of course this will time-out. ¿How can we solve it? We have to accept some sub-optimality and pray that it passes the test cases. In the code below, we try taking from the maximum number of elements, to the maximum number of elements minus 3.

Code


We can define our almost-optimal algorithm recursively:

# The Heist, by Juan I. Carrano | IEEEXtreme 8.0 (2014)
N, M = [int(k) for k in raw_input().split()]

items = {}

while True:
    ss = raw_input().strip()
    if ss == 'END':
        break
    name,sload,sprice = ss.split(',')
    items[name.strip()] = (int(sload), int(sprice))

items_scored = sorted([(float(p)/l, k) for k,(l,p) in items.iteritems()], reverse=True)


def choose(cap_remaining, xitems_scored):
    if not xitems_scored:
        return (0, [])
    else:
        score,item = xitems_scored[0]
        alternatives = []
        l, p = items[item]
        to_load_max = cap_remaining/l

        for to_load in xrange(max(0, to_load_max-3), to_load_max+1):
            this_item = [[item, to_load]]
            this_money = to_load*p

            best_fill_price, best_fill_items = choose(cap_remaining - to_load*l, xitems_scored[1:])

            alternatives.append((this_money + best_fill_price,
                                this_item + best_fill_items))

        return max(alternatives)

total_money, best_solution_items  = choose(N*M, items_scored)

total_weight =0

for it, amount in sorted(best_solution_items):
    l = amount*items[it][0]
    p = amount*items[it][1]
    total_weight += l
    if (amount > 0):
        print "%s,%d,%d,%d"%(it, amount, l, p)

print "%d,%d"%(total_weight, total_money)
print "Each robber gets: %.2f"%(round(100*total_money/float(N))/100.0)

Let's explain  it:

 

Lines 2 to 11

We parse the input. Each type of item is placed in a dictionary (called items) whose keys are the items' names and whose values are tuples of (unit weight, unit value).

 

Line 13:

We create a list with the items sorted from the most valuable to the cheapest, in terms of their price/weight ratio (let's call this the score).

 

Lines 16 to 34, the recursive function choose

This function explores a decision tree in search for the best combination of items. It takes as arguments the capacity available and a list of (item score, item name), sorted so that items with best score come first. It returns the best way to fill up that capacity with the given items, along with the total value of such items.
While it is generally a bad idea to implement a tree search with a recursive function (we are using the program stack as the backtracking stack), it is easier and the problem states that there won't be more than 5 item types, so the recursion depth is limited to a fairly small number.
Let's see how choose works. The terminal case is when there are no items to add. The best and only way to fill up the capacity is to take no items, so we return zero (the price of no items) and an empty list.
If there are items in the xscored_items list we take the first one (the one with the best price/weight ratio) and calculate the maximum amount of units we can take (to_load_max) as the quotient of the capacity divided by the item's unit weight. We then evaluate the possibilities of taking different amounts of units. If this were a completely exhaustive search, we should try taking from 0 to to_load_max and see which one produces the best price. Here we try only with 3 possibilities, greatly reducing the run time at the cost of potential sub-optimality.
The procedure to evaluate a possibility is as follows:
  • Line 27: Calculate the price of the units we are taking (to_load*p).
  • Line 29: Calculate the weight of the units we are taking (to_load*l). Substract this weight from the current load capacity. Call the function choose with this reduced load capacity and a list of items that excludes the one we are currently evaluating (pass on xitems_scored from the second item onwards). choose should return the best way to fill up the remaining space, and the price of such fill.
  • Line 31: The price of the current possibility is the price of the units we are taking plus the price if the best fill up. We also create a list by combining the items returned by choose in line 29 with the possibility we are evaluating.
In line 34 we return the best of all the possibilities we evaluated (the one with the greatest price). This is why we say that choose returns the best way to fill up the available capacity.

 

Lines 38 to 48

This part of the code formats the output according to the requirements of the problem.

 

Files


This code, along with test cases are available at this Drive folder.