How to deal with all of those MS Word Funky characters

Put simply, I have a text box, and people commonly cut and paste
information into it from Microsoft Word. The problem is that
Word inserts all sorts of funky characters (smart quotes, em dashes) that
the system (PHP-based) doesn't understand. Does anyone know of a way to
filter out these Microsoft-specific characters? Does PHP have a special
function for this? Thanks a lot!

Aug 19 '05 #1

Hooray, I can actually be of use to this group for once. Yes, if you look
in the user notes on php.net for the htmlentities function, you will see
an entry from mail at britlinks dot com (19-May-2004 05:27). I've listed
it below for reference. Mind you, I'm sure the hardcore programmers on
this group will be able to formulate a one-line regexp for this, and we
look forward to seeing it.

In the meantime, I hope this helps.
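<?php
// Strips slashes and converts special characters in $var to HTML entities,
// including the Windows-1252 characters (bytes 128-159) that Word inserts.
// Assumes $var is single-byte Windows-1252/ISO-8859-1 text, not UTF-8.
function htmlfriendly($var, $nl2br = false) {
    // Windows-1252 byte value => HTML entity
    $chars = array(
        128 => '&euro;',
        130 => '&sbquo;',
        131 => '&fnof;',
        132 => '&bdquo;',
        133 => '&hellip;',
        134 => '&dagger;',
        135 => '&Dagger;',
        136 => '&circ;',
        137 => '&permil;',
        138 => '&Scaron;',
        139 => '&lsaquo;',
        140 => '&OElig;',
        142 => '&#381;',
        145 => '&lsquo;',
        146 => '&rsquo;',
        147 => '&ldquo;',
        148 => '&rdquo;',
        149 => '&bull;',
        150 => '&ndash;',
        151 => '&mdash;',
        152 => '&tilde;',
        153 => '&trade;',
        154 => '&scaron;',
        155 => '&rsaquo;',
        156 => '&oelig;',
        158 => '&#382;',
        159 => '&Yuml;');
    // ISO-8859-1 is passed explicitly so that htmlentities() should leave
    // bytes 128-159 alone; str_replace() then maps them to entities.
    $var = str_replace(array_map('chr', array_keys($chars)), $chars,
        htmlentities(stripslashes($var), ENT_COMPAT, 'ISO-8859-1'));
    return $nl2br ? nl2br($var) : $var;
}
?>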
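
If the goal is to filter the characters out rather than turn them into HTML entities, something along these lines might do. It is only a rough sketch, not the php.net note: the function name word_to_ascii and the choice of ASCII replacements are my own assumptions, and it assumes the pasted text arrives as single-byte Windows-1252 (UTF-8 input would need a different map).

```php
<?php
// Hypothetical helper (not from the php.net note): swaps common
// Windows-1252 "Word" punctuation for plain ASCII equivalents.
// Assumes $text is single-byte Windows-1252, not UTF-8.
function word_to_ascii($text) {
    $map = array(
        chr(130) => ',',   // low single quote
        chr(132) => '"',   // low double quote
        chr(133) => '...', // ellipsis
        chr(145) => "'",   // left single quote
        chr(146) => "'",   // right single quote / apostrophe
        chr(147) => '"',   // left double quote
        chr(148) => '"',   // right double quote
        chr(149) => '*',   // bullet
        chr(150) => '-',   // en dash
        chr(151) => '--',  // em dash
    );
    // strtr() with an array replaces every key in a single pass.
    return strtr($text, $map);
}

// Example: smart double quotes and an em dash become ASCII.
echo word_to_ascii(chr(147) . 'Hi' . chr(148) . chr(151) . 'there'), "\n";
// prints "Hi"--there
```

strtr() is used here instead of a regexp because each key is a single byte and the replacement happens in one pass, so nothing gets replaced twice.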
Aug 19 '05 #2
