Here's how you can correctly match the first word in a string:
- import re
-
- text = "subject, adjust, jump, university, major"
-
- # Match the first word in the string
- match = re.match(r"^\w+", text, flags=re.IGNORECASE)
-
- if match:
-     print(match.group())
- else:
-     print("No match found")
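The snippet above relies on the string starting with a word character. As a minimal sketch of what happens otherwise, the hypothetical helper first_word below (the name and the sample strings are illustrative assumptions, not part of the original example) uses re.search so that leading spaces or punctuation are skipped:

```python
import re

def first_word(text):
    """Return the first run of word characters in text, or None."""
    # re.search scans forward, so leading spaces or punctuation are skipped.
    match = re.search(r"\w+", text)
    return match.group() if match else None

print(first_word("subject, adjust, jump"))    # subject
print(first_word("  ...hello world"))         # hello
# re.match only tries position 0, so it finds nothing here.
print(re.match(r"^\w+", "  ...hello world"))  # None
```

Whether re.match or re.search is the better fit depends on whether a string that starts with punctuation should count as having no first word.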
To match the last word in a string, you can use regular expressions combined with string manipulation techniques. The approach varies slightly depending on whether you use re.findall, re.search, or another method.
Using re.findall
To find the last word using re.findall, you would typically capture all words and then select the last one:
- import re
- text = "subject, adjust, jump, university, major"
- # Find all words in the string
- words = re.findall(r'\w+', text, flags=re.IGNORECASE)
- # Get the last word
- last_word = words[-1] if words else None
- print(last_word)
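The text also mentions re.search as an alternative. One possible sketch, assuming that "last word" means the final run of word characters with only punctuation or whitespace after it, anchors the pattern to the end of the string. The helper name last_word is hypothetical:

```python
import re

def last_word(text):
    """Return the last run of word characters in text, or None."""
    # (\w+) captures a word; \W*$ allows only non-word characters after it.
    match = re.search(r"(\w+)\W*$", text)
    return match.group(1) if match else None

print(last_word("subject, adjust, jump, university, major"))  # major
print(last_word("Hello, world!"))                             # world
print(last_word(""))                                          # None
```

Compared with re.findall, this avoids building a list of every word, at the cost of a slightly less obvious pattern.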
Match Words Starting with "gob"
- import re
- text = "goblin, goblet, gobsmacked, gobble, dog, gob"
- # Find words starting with 'gob'
- matches = re.findall(r'\bgob\w*', text, flags=re.IGNORECASE)
- print(matches)
Explanation
\bgob\w*:
\b asserts a word boundary before "gob".
gob is the specific prefix we are looking for.
\w* matches zero or more word characters following "gob".
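To see what the leading \b contributes, the following sketch runs the pattern with and without it. The sample text is an assumption chosen so that "gob" also appears in the middle of a word:

```python
import re

# "hobgoblin" contains "gob" in the middle of the word, not at its start.
text = "hobgoblin, goblet, Gobi"

with_boundary = re.findall(r"\bgob\w*", text, flags=re.IGNORECASE)
without_boundary = re.findall(r"gob\w*", text, flags=re.IGNORECASE)

print(with_boundary)     # ['goblet', 'Gobi']
print(without_boundary)  # ['goblin', 'goblet', 'Gobi']
```

Without the boundary, the pattern picks up the fragment "goblin" from inside "hobgoblin", which is usually not what "words starting with gob" is meant to return.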
Match Words Ending with "te"
- import re
- text = "complete, update, bite, great, late, state"
- # Find words ending with 'te'
- matches = re.findall(r'\b\w*te\b', text.lower())
- print(matches)
Explanation
\b\w*te\b:
\b asserts a word boundary.
\w* matches zero or more word characters preceding "te".
te is the suffix we are looking for.
\b asserts a word boundary to ensure "te" is at the end of the word.
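The example above lowercases the text before matching, which also lowercases the results. As a small sketch of the difference, assuming a mixed-case sample sentence that is not from the original, the same pattern can be run with re.IGNORECASE to keep the original casing:

```python
import re

text = "Complete the UPDATE, then bite late"

# Lowercasing first changes the casing of the returned words.
lowered = re.findall(r"\b\w*te\b", text.lower())
# re.IGNORECASE matches the same words but returns them as written.
preserved = re.findall(r"\b\w*te\b", text, flags=re.IGNORECASE)

print(lowered)    # ['complete', 'update', 'bite', 'late']
print(preserved)  # ['Complete', 'UPDATE', 'bite', 'late']
```

Either approach finds the same words; the choice matters only if the original capitalization needs to survive.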
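The next example finds every word in the text that contains the substring "uj" and stores the results in the matches list. None of the words in the sample text actually contains "uj" (in "subject" the u is followed by b, and in "adjust" and "jump" the j comes before the u), so the code prints an empty list, []: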
- import re
- text = "subject, adjust, jump, university, major"
- # Find words containing 'uj' (case-insensitive)
- matches = re.findall(r'\b\w*uj\w*\b', text, flags=re.IGNORECASE)
- print(matches)
r'\b\w*uj\w*\b':
\b is a word boundary anchor, which ensures that the match occurs at the beginning or end of a word. It's useful if you want to match whole words but is optional if you're just looking for substrings within words.
\w* matches any number of word characters (letters, digits, and underscores) before and after the substring uj.
uj is the substring you're looking to match within the words.
Flags:
re.IGNORECASE makes the search case-insensitive.
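The same pattern can be wrapped in a reusable function. The sketch below is one way to do that, not a definitive implementation; the helper name words_containing and the first sample text are assumptions, and re.escape is used so that a fragment containing regex metacharacters is treated literally:

```python
import re

def words_containing(text, fragment):
    """Return every word in text that contains fragment, ignoring case."""
    # re.escape keeps characters such as '.' or '+' in fragment literal.
    pattern = rf"\b\w*{re.escape(fragment)}\w*\b"
    return re.findall(pattern, text, flags=re.IGNORECASE)

print(words_containing("Fuji, jump, Gujarat, major", "uj"))
# ['Fuji', 'Gujarat']
print(words_containing("subject, adjust, jump, university, major", "uj"))
# []
```

The second call uses the sample text from this section and returns an empty list, because none of those words contains the letters "uj" in that order.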