Friday, June 21, 2024

Regex cookbook

Here's how you can correctly match the first word in a string:

    import re

    text = "subject, adjust, jump, university, major"

    # Match the first word in the string.
    # re.match only matches at the start of the string, so "^" is redundant but harmless.
    match = re.match(r"^\w+", text)

    if match:
        print(match.group())  # subject
    else:
        print("No match found")
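
Because re.match only succeeds at position 0, the snippet above reports no match when the string begins with a space or punctuation. A minimal sketch of a more forgiving variant, assuming a hypothetical helper named first_word (not part of the original example), uses re.search instead:

```python
import re


def first_word(text):
    """Return the first run of word characters in text, or None (hypothetical helper)."""
    match = re.search(r"\w+", text)
    return match.group() if match else None


print(first_word("subject, adjust, jump"))   # subject
print(first_word("  (hello) world"))         # hello
print(re.match(r"^\w+", "  (hello) world"))  # None
```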



To match the last word in a string, you can combine a regular expression with ordinary list indexing. The approach varies slightly depending on whether you use re.findall, re.search, or another method.

Using re.findall

To find the last word using re.findall, you would typically capture all words and then select the last one:

    import re

    text = "subject, adjust, jump, university, major"

    # Find all words in the string
    words = re.findall(r'\w+', text)

    # Get the last word (None if the string has no words)
    last_word = words[-1] if words else None

    print(last_word)  # major
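
Using re.search

The text above mentions re.search as an alternative. One possible sketch, assuming a hypothetical helper named last_word_via_search, anchors the pattern at the end of the string and allows trailing punctuation after the word:

```python
import re


def last_word_via_search(text):
    """Return the last run of word characters in text, or None (hypothetical helper)."""
    # (\w+) captures a word; \W*$ allows only non-word characters up to the end.
    match = re.search(r"(\w+)\W*$", text)
    return match.group(1) if match else None


print(last_word_via_search("subject, adjust, jump, university, major"))  # major
print(last_word_via_search("hello world!"))                              # world
```

This avoids building the full list of words, which may matter for long strings, at the cost of a slightly less obvious pattern.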

Match Words Starting with "gob"


    import re

    text = "goblin, goblet, gobsmacked, gobble, dog, gob"

    # Find words starting with 'gob'
    matches = re.findall(r'\bgob\w*', text, flags=re.IGNORECASE)

    print(matches)  # ['goblin', 'goblet', 'gobsmacked', 'gobble', 'gob']

Explanation

  • \bgob\w*:
    • \b asserts a word boundary before "gob".
    • gob is the specific prefix we are looking for.
    • \w* matches zero or more word characters following "gob".
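
To see what the \b anchor contributes, here is a small comparison on a sample string of my own (not from the example above). Without the boundary, the pattern also picks up "gob" in the middle of a word:

```python
import re

# Hypothetical sample text chosen to include "gob" inside a longer word.
text = "hobgoblin, goblet, Gobi"

with_boundary = re.findall(r"\bgob\w*", text, flags=re.IGNORECASE)
without_boundary = re.findall(r"gob\w*", text, flags=re.IGNORECASE)

print(with_boundary)     # ['goblet', 'Gobi']
print(without_boundary)  # ['goblin', 'goblet', 'Gobi']
```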


Match Words Ending with "te"

    import re

    text = "complete, update, bite, great, late, state"

    # Find words ending with 'te' (the text is lowercased first, so matching is case-insensitive)
    matches = re.findall(r'\b\w*te\b', text.lower())

    print(matches)  # ['complete', 'update', 'bite', 'late', 'state']

Explanation

  • \b\w*te\b:
    • \b asserts a word boundary.
    • \w* matches zero or more word characters preceding "te".
    • te is the suffix we are looking for.
    • \b asserts a word boundary to ensure "te" is at the end of the word.
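
The example above lowercases the text before matching, which also lowercases the words it returns. If the original capitalization matters, passing flags=re.IGNORECASE is one alternative. A short sketch on a mixed-case sample of my own:

```python
import re

# Hypothetical mixed-case sample, not from the example above.
text = "Complete, UPDATE, bite, Great"

lowered = re.findall(r"\b\w*te\b", text.lower())
preserved = re.findall(r"\b\w*te\b", text, flags=re.IGNORECASE)

print(lowered)    # ['complete', 'update', 'bite']
print(preserved)  # ['Complete', 'UPDATE', 'bite']
```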



Match Words Containing "uj"

The matches list will contain every word in the text that contains the substring "uj". None of the words in the sample text below does ("adjust" and "jump" contain "ju", and "subject" contains "ubj"), so this snippet prints an empty list:

    import re

    text = "subject, adjust, jump, university, major"

    # Find words containing 'uj' (case-insensitive)
    matches = re.findall(r'\b\w*uj\w*\b', text, flags=re.IGNORECASE)

    print(matches)  # []

Explanation

  • r'\b\w*uj\w*\b':
    • \b is a word boundary anchor: it matches at the position between a word character and a non-word character. In this pattern it is redundant, because the greedy \w* on each side already extends the match to the edges of the word.
    • \w* matches any number of word characters (letters, digits, and underscores) before and after the substring uj.
    • uj is the substring you're looking to match within the words.
  • Flags: re.IGNORECASE makes the search case-insensitive.
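
To see the pattern return something, here is a sketch on a different sample (the word list is my own, not from the example above). It also checks that dropping the \b anchors gives the same result here, since the greedy \w* already runs to the edges of each word:

```python
import re

# Hypothetical sample text in which some words do contain "uj".
text = "Gujarat, jujube, Fuji, subject"

with_boundaries = re.findall(r"\b\w*uj\w*\b", text, flags=re.IGNORECASE)
without_boundaries = re.findall(r"\w*uj\w*", text, flags=re.IGNORECASE)

print(with_boundaries)                        # ['Gujarat', 'jujube', 'Fuji']
print(with_boundaries == without_boundaries)  # True
```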



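Match emails

    import re

    text = "Email me at john.doe@example.com or jane_smith123@test.co.uk"

    # Full addresses
    matches = re.findall(r'\b[\w.-]+@[a-zA-Z-]+\.[a-zA-Z.]{2,6}\b', text)
    print(matches)  # ['john.doe@example.com', 'jane_smith123@test.co.uk']

Return the string after @

    # @(\w+) would stop at the first dot and return ['example', 'test']
    matches = re.findall(r'@([\w.-]+)', text)
    print(matches)  # ['example.com', 'test.co.uk']

Return the string before @

    # (\w+)@ would stop at the dot and return ['doe', 'jane_smith123']
    matches = re.findall(r'([\w.-]+)@', text)
    print(matches)  # ['john.doe', 'jane_smith123']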
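
To pull out both parts of each address in one pass, one option is a compiled pattern with named groups and re.finditer. This is only a sketch: the group names user and domain are my own choice, and the pattern is a loose approximation of email syntax, not a validator.

```python
import re

text = "Email me at john.doe@example.com or jane_smith123@test.co.uk"

# user: word characters, dots, hyphens; domain: labels separated by dots.
pattern = re.compile(r"(?P<user>[\w.-]+)@(?P<domain>[\w-]+(?:\.[\w-]+)+)")

pairs = [(m.group("user"), m.group("domain")) for m in pattern.finditer(text)]
print(pairs)
# [('john.doe', 'example.com'), ('jane_smith123', 'test.co.uk')]
```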





Match strings that start with "a", followed by one or more characters, and end with "r"

    import re

    pattern = r"^a.+r$"
    text1 = "ar"
    text2 = "abr"

    print(re.findall(pattern, text1))  # [] (no match)
    print(re.findall(pattern, text2))  # ['abr'] (match)

Explanation

  • ^a.+r$:
    • ^a requires the string to start with "a".
    • .+ matches one or more characters. Without the plus sign, the dot would match exactly one character.
    • r$ requires the string to end with "r".
  • Overall, the pattern matches any string that starts with "a", continues with one or more characters, and ends with "r". For example, it matches "abr" but not "ar".
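
The effect of the quantifier after the dot can be checked side by side. A small sketch (the labels and the extra sample "abcr" are my own additions) compares ".", ".+", and ".*" between the "a" and the "r":

```python
import re

samples = ["ar", "abr", "abcr"]
patterns = {
    "exactly one": r"^a.r$",
    "one or more": r"^a.+r$",
    "zero or more": r"^a.*r$",
}

# For each pattern, collect the samples it matches.
results = {
    label: [s for s in samples if re.search(pattern, s)]
    for label, pattern in patterns.items()
}

for label, matched in results.items():
    print(label, matched)
# exactly one ['abr']
# one or more ['abr', 'abcr']
# zero or more ['ar', 'abr', 'abcr']
```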


Search for the XXX-XXX-XXXX phone format

    import re

    text = "ambiorix rodriguez 809-714-2819 809-560-8344 829-561-3454 edad 42"
    matches = re.findall(r'\d{3}-\d{3}-\d{4}', text)
    print(matches)

Output:

    ['809-714-2819', '809-560-8344', '829-561-3454']
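
A pattern made only of digit groups can also match inside a longer run of digits. If that is a concern, one possible refinement is to wrap it in \b anchors. The sample strings below are made up for illustration:

```python
import re

strict = r"\b\d{3}-\d{3}-\d{4}\b"
loose = r"\d{3}-\d{3}-\d{4}"

ok = re.findall(strict, "call 809-714-2819 today")
rejected = re.findall(strict, "order 1809-714-28190")
partial = re.findall(loose, "order 1809-714-28190")

print(ok)        # ['809-714-2819']
print(rejected)  # []
print(partial)   # ['809-714-2819']
```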

