Automatically shorten long strings when dumping with pretty print

I have the following test program:

    from random import choice
    d = { }
    def data(length):
        alphabet = 'abcdefghijklmnopqrstuvwxyz'
        res = ''
        for _ in xrange(length):
            res += choice(alphabet)
        return res
    # Create the test data
    for cnt in xrange(10):
        key = 'key-%d' % (cnt)
        d[key] = data(30)
    def pprint_shorted(d, max_length):
        import pprint
        pp = pprint.PrettyPrinter(indent=4)
        # max_length is not used yet; this just dumps the dict as-is
        pp.pprint(d)
    pprint_shorted(d, 10)

Currently the output is something like:

    {   'key-0': 'brnneqgetvanmggyayppxevwcnxvue',
        'key-1': 'qjzrklrdkykililenwcyhaexuylgub',
        'key-2': 'ayddiaxhvgxpszutnjdwlgojqaluhr',
        'key-3': 'rmjpzxrmbogezorigkycqhpsctinzq',
        'key-4': 'botfczymszkzwuiecyarknnrvwavnr',
        'key-5': 'norifblhtvfnwblcyeipjmteznylfy',
        'key-6': 'tiiubgdwxnogdmbafvnujbwpfdopjl',
        'key-7': 'badgwbrrqunivylutbxqkaeuctrykt',
        'key-8': 'wulrfkqfqqecxmscayzdbatyispwtu',
        'key-9': 'gzlwfvjrevlyvbmrvuisnyhhbbwtdd'}

In my production data, the strings are sometimes really long (several thousand characters, coming from base64-encoded attachments, for example), and I do not want them filling up my logs. I would like something like:

    {   'key-0': 'brnneqgetv...',
        'key-1': 'qjzrklrdky...',
        'key-2': 'ayddiaxhvg...',
        'key-3': 'rmjpzxrmbo...',
        'key-4': 'botfczymsz...',
        'key-5': 'norifblhtv...',
        'key-6': 'tiiubgdwxn...',
        'key-7': 'badgwbrrqu...',
        'key-8': 'wulrfkqfqq...',
        'key-9': 'gzlwfvjrev...'}

That is, string values in the dict that are longer than max_length should be truncated and suffixed with an ellipsis. Is there any built-in support in pprint for this, or must I create a copy of the dict by manually walking it and shortening the strings myself?

Best answer

You can subclass PrettyPrinter and override its _format method (it is a private method, so its behaviour may change between Python versions):

    import pprint

    class P(pprint.PrettyPrinter):
        def _format(self, object, *args, **kwargs):
            if isinstance(object, basestring):
                if len(object) > 20:
                    object = object[:20] + '...'
            return pprint.PrettyPrinter._format(self, object, *args, **kwargs)

    P().pprint('x' * 1000)

This prints:

    'xxxxxxxxxxxxxxxxxxxx...'
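
If you would rather not depend on a private method, the other option mentioned in the question (walking the structure and printing a shortened copy) can be sketched roughly as follows. This is only a sketch under some assumptions: it is written for Python 3 (on Python 2 you would test against basestring instead of str), the names shorten and pprint_shortened are made up for illustration, and it only descends into dicts, lists and tuples. Dict keys are left untouched.

```python
import pprint


def shorten(obj, max_length):
    """Return a copy of obj in which every str longer than max_length
    is cut to max_length characters and suffixed with '...'."""
    if isinstance(obj, str):
        if len(obj) > max_length:
            return obj[:max_length] + '...'
        return obj
    if isinstance(obj, dict):
        # Only the values are shortened; keys are kept as they are.
        return {key: shorten(value, max_length) for key, value in obj.items()}
    if isinstance(obj, list):
        return [shorten(item, max_length) for item in obj]
    if isinstance(obj, tuple):
        return tuple(shorten(item, max_length) for item in obj)
    # Anything else (numbers, None, custom objects) is returned unchanged.
    return obj


def pprint_shortened(obj, max_length):
    """Pretty-print a shortened copy; the original object is not modified."""
    pprint.PrettyPrinter(indent=4).pprint(shorten(obj, max_length))


pprint_shortened({'key-0': 'brnneqgetvanmggyayppxevwcnxvue'}, 10)
```

The trade-off is an extra copy of the data for each dump, but the truncation no longer depends on how pprint decides to lay out its output.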