Split string with multiple delimiters in Python
Luckily, Python has this built-in :)
    import re
    # text is the string to split (avoid naming it str, which shadows the built-in)
    re.split('; |, ', text)
Update: following your comment:
    >>> a = 'Beautiful, is; better*than\nugly'
    >>> import re
    >>> re.split(r'; |, |\*|\n', a)
    ['Beautiful', 'is', 'better', 'than', 'ugly']
Without regular expressions: do a str.replace('; ', ', ') to turn every delimiter into the same one, and then a str.split(', ').
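A minimal sketch of the replace-then-split idea. The helper name split_two and the sample string are made up for illustration; it only handles two fixed delimiters.

```python
def split_two(text, primary=', ', secondary='; '):
    """Split on two delimiters without regular expressions.

    Hypothetical helper: rewrite every `secondary` delimiter as `primary`,
    then do a single str.split on `primary`.
    """
    return text.replace(secondary, primary).split(primary)


sample = 'Beautiful, is; better, than; ugly'  # made-up input
parts = split_two(sample)
print(parts)  # ['Beautiful', 'is', 'better', 'than', 'ugly']
```

This stays readable for two delimiters; with more than that, chaining replace calls gets clumsy and a regular expression is probably the better fit.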
Here's a safe way for any iterable of delimiters, using regular expressions:
    >>> import re
    >>> delimiters = "a", "...", "(c)"
    >>> example = "stackoverflow (c) is awesome... isn't it?"
    >>> regexPattern = '|'.join(map(re.escape, delimiters))
    >>> regexPattern
    'a|\\.\\.\\.|\\(c\\)'
    >>> re.split(regexPattern, example)
    ['st', 'ckoverflow ', ' is ', 'wesome', " isn't it?"]
re.escape lets you build the pattern automatically, with every delimiter escaped so that it is matched literally.
Here's this solution as a function for your copy-pasting pleasure:
    import re

    def split(delimiters, string, maxsplit=0):
        # Escape each delimiter so it is matched literally, then join with "|"
        regexPattern = '|'.join(map(re.escape, delimiters))
        return re.split(regexPattern, string, maxsplit=maxsplit)
If you're going to split often using the same delimiters, compile your regular expression beforehand with re.compile and call split on the compiled pattern object (RegexObject.split in the Python 2 docs).
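As a rough sketch of compiling once and reusing the pattern (the delimiters, the input lines and the name DELIMITER_RE are all assumptions picked for illustration):

```python
import re

delimiters = ('; ', ', ')  # assumed example delimiters

# Compile once; the resulting pattern object can be reused for every split.
DELIMITER_RE = re.compile('|'.join(map(re.escape, delimiters)))

lines = ['a, b; c', 'd; e, f']  # made-up inputs
rows = [DELIMITER_RE.split(line) for line in lines]
print(rows)  # [['a', 'b', 'c'], ['d', 'e', 'f']]
```

The re module also caches recently used patterns internally, so the gain from compiling by hand is usually small; the main benefit is having one named pattern to pass around.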
In response to Jonathan's answer above, this only seems to work for certain delimiters. For example:
    >>> a = 'Beautiful, is; better*than\nugly'
    >>> import re
    >>> re.split(r'; |, |\*|\n', a)
    ['Beautiful', 'is', 'better', 'than', 'ugly']
    >>> b = '1999-05-03 10:37:00'
    >>> re.split('- :', b)
    ['1999-05-03 10:37:00']
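The pattern '- :' only matches the literal three-character sequence "- :", which never occurs in the string. Putting the delimiters in square brackets makes them a character class, which matches any single one of them: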
    >>> re.split('[- :]', b)
    ['1999', '05', '03', '10', '37', '00']
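To compare the spellings side by side, here is a small sketch on the same timestamp. The variable names are only illustrative, and the alternation form is added here for comparison; it is not part of the answer above.

```python
import re

b = '1999-05-03 10:37:00'

# '- :' is one literal three-character sequence, so nothing is split.
literal = re.split('- :', b)
print(literal)        # ['1999-05-03 10:37:00']

# A character class matches any ONE of the listed characters.
char_class = re.split('[- :]', b)
print(char_class)     # ['1999', '05', '03', '10', '37', '00']

# The same result written as an alternation, one branch per delimiter.
alternation = re.split('-| |:', b)
print(alternation)    # ['1999', '05', '03', '10', '37', '00']
```

A character class only works for single-character delimiters; for longer ones such as '; ', alternation with | is still needed.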
This is what the regex looks like:
    import re

    text = "a; b, c"  # sample input, assumed for illustration

    # "semicolon or (a comma followed by a space)"
    pattern = re.compile(r";|, ")

    # "(semicolon or a comma) followed by a space"
    pattern = re.compile(r"[;,] ")

    print(pattern.split(text))
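The two patterns are not interchangeable. A short sketch on a made-up input that mixes both delimiter styles shows where they differ (the variable names are assumptions, chosen for readability):

```python
import re

text = 'a;b, c; d'  # made-up input mixing both delimiter styles

semicolon_or_comma_space = re.compile(r';|, ')
either_then_space = re.compile(r'[;,] ')

first = semicolon_or_comma_space.split(text)
second = either_then_space.split(text)

print(first)   # ['a', 'b', 'c', ' d']
print(second)  # ['a;b', 'c', 'd']
```

The first pattern splits on a bare semicolon and leaves the following space attached to the next item; the second requires the trailing space, so 'a;b' stays in one piece.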