In the Ashton String task, the goal is to:
Arrange all the distinct substrings of a given string in
lexicographical order and concatenate them. Print the Kth character of
the concatenated string. It is assured that given value of K will be
valid i.e. there will be a Kth character.
First line will contain a number T i.e. number of test cases. First
line of each test case will contain a string containing characters
(a−z) and second line will contain a number K.
Print Kth character ( the string is 1 indexed )
1 ≤ T ≤ 5
1 ≤ length ≤ 10^5
K will be an appropriate integer.
For example, given the input:
```
1
dbac
3
```
The output would be:

```
c
```
I've attempted the task with this code, and it works for relatively short strings:
```python
from itertools import chain

def zipngram(text, n=2):
    words = list(text)
    return zip(*[words[i:] for i in range(n)])

for _ in range(int(input())):  # T test cases
    t = input()
    position = int(input()) - 1  # 0th indexing
    chargrams = chain(*[zipngram(t, i) for i in range(1, len(t) + 1)])
    concatstr = ''.join(sorted(''.join(s) for s in chargrams))
    print(concatstr[position])
```
But if the input file looks like this: http://pastebin.com/raw/WEi2p09H and the desired output is:

```
l
s
y
h
s
```
the interpreter throws a `MemoryError`:

```
Traceback (most recent call last):
  File "solution.py", line 11, in <module>
    chargrams = chain(*[zipngram(t,i) for i in range(1,len(t)+1)])
  File "solution.py", line 11, in <listcomp>
    chargrams = chain(*[zipngram(t,i) for i in range(1,len(t)+1)])
  File "solution.py", line 6, in zipngram
    return zip(*[words[i:] for i in range(n)])
  File "solution.py", line 6, in <listcomp>
    return zip(*[words[i:] for i in range(n)])
MemoryError
```
How can the `MemoryError` be resolved? Can the task be solved another way using only native Python 2 or Python 3?
I tried to resolve the `MemoryError` by pruning the stack with `heapq`, but now the runtime is extremely slow: the program spends its time pushing to and popping from the heap instead of exhausting memory.
```python
from itertools import chain
import heapq

t = int(input())
s = list(input())
k = int(input())

stack = []
for n in range(1, len(s) + 1):
    for j in range(n):
        ngram = ''.join(s[j:])
        # Push the ngram onto the heap.
        heapq.heappush(stack, ngram)

pruned_stack = []
temp_k = 0
# Prune the heap: pop the smallest entries until k characters are covered.
while stack != [] and temp_k < k:
    x = heapq.heappop(stack)
    pruned_stack.append(x)
    temp_k += len(x)

# Replace stack with pruned_stack.
stack = pruned_stack
print(''.join(chain(*pruned_stack))[k - 1])  # k is 1-indexed
```
Is there a way to strike a balance between using so much memory that it causes a **MemoryError** and the very slow runtime of pushing to and popping from a **heapq**?
A `MemoryError` means that the program consumed all available memory and crashed.

A possible solution is to use lazy iterables: they compute each value only on demand, not all at once. They work in Python 2 as well, but Python 3 has better support for them.
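To illustrate the difference (a CPython-specific sketch; the exact byte counts vary by version and platform):

```python
import sys

n = 10**6

eager = [i * i for i in range(n)]  # list: every value is built and stored up front
lazy = (i * i for i in range(n))   # generator: values are produced one at a time, on demand

print(sys.getsizeof(eager))  # on the order of megabytes
print(sys.getsizeof(lazy))   # a few hundred bytes, regardless of n
```

The generator stays tiny no matter how large `n` is, because it only holds the state needed to produce the next value.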
Adapting your program to generators should require only minor changes. To index into a generator without converting it to a list (which would nullify the lazy benefit), see: Get the nth item of a generator in Python
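A sketch of that idea applied to this task (not your exact code: `substrings` is an illustrative helper, and `sorted` still has to materialize the set of substrings, so this eases but does not eliminate the memory pressure for the largest inputs). The key point is that the characters stream through `chain.from_iterable` and `islice`, so the huge concatenated string is never built:

```python
from itertools import chain, islice

def substrings(s):
    # Lazily yield every substring of s, one at a time.
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            yield s[i:j]

def kth_char(s, k):
    ordered = sorted(set(substrings(s)))     # distinct substrings, per the task
    chars = chain.from_iterable(ordered)     # lazy concatenation, character by character
    return next(islice(chars, k - 1, None))  # k is 1-indexed

print(kth_char('dbac', 3))  # → c
```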