4👍
The formatting of email replies depends on the client. There is no reliable way to extract the newest message without the risk of removing too much or not enough.
However, a common way to mark quotes is to prefix them with >, so lines starting with that character are likely to be quotes, especially if there are several of them at the very beginning or end of the email.
But the "On Thu, Mar 24, 2011 at 3:51 PM, <test@test.com> wrote:" line from your example is hard to extract. A line ending with a : right before a quote might indicate that it belongs to the quote, but you cannot know that for sure: it could also be part of the new message, with the colon being a typo for . (on German keyboards, : is SHIFT+.).
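As a rough illustration of the > heuristic, the following sketch drops a trailing block of quoted lines and, if present, a single line ending in a colon right before it. The function name `strip_trailing_quote` is made up for this example, and the approach is only a guess about the message layout, so it can still remove too much or too little.

```python
def strip_trailing_quote(message: str) -> str:
    """Heuristically remove a trailing block of '>'-quoted lines."""
    lines = message.rstrip().splitlines()

    # Walk backwards past quoted lines and any blank lines between them.
    end = len(lines)
    while end > 0 and (
        lines[end - 1].lstrip().startswith(">") or not lines[end - 1].strip()
    ):
        end -= 1

    # No quoted lines at the end: leave the message untouched.
    if not any(line.lstrip().startswith(">") for line in lines[end:]):
        return message

    # A line ending with ':' right before the quote is probably the
    # attribution ("On ... wrote:"), but this is only an assumption.
    if end > 0 and lines[end - 1].rstrip().endswith(":"):
        end -= 1

    return "\n".join(lines[:end]).rstrip()
```

Only quotes at the end of the message are handled here; top-quoted or interleaved replies would need a different rule.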
2👍
The answer @LAMRIN TAWSRAS gave works for parsing the text before the Gmail date expression only if a match is found; otherwise an error is thrown. Also, there is no need to search the entire message for multiple date expressions, since only the first one found is needed. Therefore, I would refine that solution to use re.search():
import re

def get_body_before_gmail_reply_date(msg):
    body_before_gmail_reply = msg
    # regex for a date format like "On Thu, Mar 24, 2011 at 3:51 PM"
    matching_string_obj = re.search(r"\w+\s+\w+[,]\s+\w+\s+\d+[,]\s+\d+\s+\w+\s+\d+[:]\d+\s+\w+.*", msg)
    if matching_string_obj:
        # split on that match; group() returns the full matched string
        body_before_gmail_reply_list = msg.split(matching_string_obj.group())
        # the string before the regex match is the body of the email
        body_before_gmail_reply = body_before_gmail_reply_list[0]
    return body_before_gmail_reply
1👍
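I think this should work:
import re
strings = "Thanks!\n\nOn Thu, Mar 24, 2011 at 3:51 PM, <test@test.com> wrote:\n> quoted text"  # sample email text
# regex for "On Thu, Mar 24, 2011 at 3:51 PM"
string_list = re.findall(r"\w+\s+\w+[,]\s+\w+\s+\d+[,]\s+\d+\s+\w+\s+\d+[:]\d+\s+\w+.*", strings)
res = strings.split(string_list[0])  # split on that match (raises IndexError if nothing matched)
print(res[0])  # the text before the regex match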
0👍
Try this:
import re

def deleteForwardedMessagesFromMessage(message):
    # cut at the first line that contains a comma followed by an address in angle brackets
    nextMessage = re.split(r"\n.*[\,].*\<\s*.*>", message)[0]
    print(nextMessage)
    return nextMessage
0👍
Try this; it works for French and English emails with headers such as:
On Thu, Mar 24, 2011 at 3:51 PM, test@test.com wrote:
Le mer. 28 avr. 2021 à 10:03, test.test@orange.com a écrit :
regex = r'\w+\s+\w+[,.]\s+(\w+\s+\d+|\d+\s+\w+)[,.]\s+\d+\s+(\w+|à)\s+\d+[:]\d+(\s+\w+)?,?\s+(\s*<?\s*[a-zA-Z]+[._-]?[a-zA-Z]+[@]\w+[.]?\w{2,3}\s*>?;?\s*)(a écrit|wrote)\s*:'
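To show how such a pattern might be used, here is a minimal sketch that compiles a variant of it and returns everything before the first reply header. The names `REPLY_HEADER` and `body_before_reply_header` are hypothetical. The variant assumes the local part of the address consists of letters with at most one `.`, `_` or `-` separator and that the top-level domain has two or three characters, so headers in other formats will simply not match.

```python
import re

# Reply header in English ("On ... wrote:") or French ("Le ... a écrit :").
REPLY_HEADER = re.compile(
    r"\w+\s+\w+[,.]\s+"                 # "On Thu," / "Le mer."
    r"(?:\w+\s+\d+|\d+\s+\w+)[,.]\s+"   # "Mar 24," / "28 avr."
    r"\d+\s+\w+\s+\d+:\d+(?:\s+\w+)?"   # "2011 at 3:51 PM" / "2021 à 10:03"
    r",?\s+<?\s*[a-zA-Z]+[._-]?[a-zA-Z]+@\w+\.?\w{2,3}\s*>?;?\s*"  # address
    r"(?:a écrit|wrote)\s*:"            # closing phrase
)


def body_before_reply_header(message: str) -> str:
    """Return the text before the first reply header, or the whole message."""
    match = REPLY_HEADER.search(message)
    if match is None:
        # No recognizable header: keep the message as it is.
        return message
    return message[:match.start()].rstrip()
```

Slicing at `match.start()` avoids splitting the message on the matched text and leaves the input unchanged when no header is found.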