The charming adventures of an analyst and his solver.
Python obfuscation and Huffman Coding.
At this point, tricking python into printing strings via ever more pernicious mechanisms got a little boring. So I switched to obfuscating fundamental computer science algorithms. Here's a JAPH that takes in a Huffman coded version of 'just another python hacker', decodes, and prints it.
# Build coding tree
def build_tree(scheme):
if scheme.startswith('*'):
left, scheme = build_tree(scheme[1:])
right, scheme = build_tree(scheme)
return (left, right), scheme
else:
return scheme[0], scheme[1:]
def decode(tree, encoded):
ret = ''
node = tree
for direction in encoded:
if direction == '0':
node = node[0]
else:
node = node[1]
if isinstance(node, str):
ret += node
node = tree
return ret
tree = build_tree('*****ju*sp*er***yct* h**ka*no')[0]
print(
decode(tree, bin(10627344201836243859174935587).lstrip('0b').zfill(103))
)
The decoding tree is built like a true LISP-style tree as a sequence of pairs. '*' represents a branch in the tree while other characters are leaf nodes. This looks like ((((('j', 'u'), ('s', 'p')), ('e', 'r')), ((('y', 'c'), 't'), (' ', 'h'))), (('k', 'a'), ('n', 'o'))) after it's constructed.
The actual Huffman coded version of our favorite string looks like 0000000001000100101011010111011101010111001000110110000110100001010111111110011001111010100110000100011, which in base-2 encoding represents around a 50% savings in space. Well worth all the effort, right?
There's a catch here, which is that this is hard to obfuscate unless we turn it into a single expression. This means that we have to convert build_tree and decode into lambda functions. Unfortunately, they are recursive and lambda functions don't do that easily. Fortunately, we can use Y combinators and then the rest is simple. These are worth some study since they will pop up again later.
Y = lambda g: (
lambda f: g(lambda arg: f(f)(arg))) (lambda f: g(lambda arg: f(f)(arg))
)
build_tree = Y(
lambda f: lambda scheme: (
(f(scheme[1:])[0], f(f(scheme[1:])[1])[0]),
f(f(scheme[1:])[1])[1 ]
) if scheme.startswith('*') else (scheme[0], scheme[1:])
)
decode = Y(lambda f: lambda x: x[3]+x[1] if not x[2] else (
f([x[0], x[0], x[2], x[3]+x[1]]) if isinstance(x[1], str) else (
f([x[0], x[1][0], x[2][1:], x[3]]) if x[2][0] == '0' else (
f([x[0], x[1][1], x[2][1:], x[3]])
)
)
))
tree = build_tree('*****ju*sp*er***yct* h**ka*no')[0]
print(
decode([
tree,
tree,
bin(10627344201836243859174935587).lstrip('0b').zfill(103), ''
])
)
The final version is really just a condensed (and expanded, weirdly) version of the above (again, make sure to use Python 3.2):
print((lambda t,e,s:(lambda g:(lambda f:g(lambda arg:f(f)(arg)))(lambda f:
g(lambda arg: f(f)(arg))))(lambda f:lambda x: x[3]+x[1]if not x[2]else f([
x[0],x[0],x[2],x[3]+x[1]])if isinstance(x[1],str)else f([x[0],x[1][0],x[2]
[1:],x[3]])if x[2][0]=='0'else f([x[0],x[1][1],x[2][1:],x[3]]))([t,t,e,s])
)((lambda g:(lambda f:g(lambda arg:f(f)(arg)))(lambda f:g(lambda arg:f(f)(
arg))))(lambda f:lambda p:((f(p[1:])[0],f(f(p[1:])[1])[0]),f(f(p[1:])[1])[
1])if p.startswith('*')else(p[0],p[1:]))('*****ju*sp*er***yct* h**ka*no')[
0],bin(10627344201836243859179756385-4820798).lstrip('0b').zfill(103),''))