Comments (4)
@batter You can probably do it using something like this (not tested on 1.8.7, but works on 2.1 and doesn't use any post-1.8 features as far as I know):
s.unpack("U*").map { |c| c > 0x7F ? "&##{c};" : [c].pack("c") }.join("")
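For reuse, the one-liner above could be wrapped in a small helper. This is only a sketch: `ascii_safe` is a made-up name (not part of htmlentities), it assumes the input is UTF-8, and it packs with `"U"` rather than `"c"` so that every piece is built the same way.

```ruby
# Hypothetical helper; the name ascii_safe is an assumption, not a gem API.
# Splits the UTF-8 string into codepoints, turns anything above 0x7F into a
# decimal numeric character reference, and leaves ASCII codepoints as they are.
def ascii_safe(s)
  s.unpack("U*").map { |c| c > 0x7F ? "&##{c};" : [c].pack("U") }.join
end

ascii_safe("<em>caf\u00e9</em>")  # => "<em>caf&#233;</em>"
```

Tags and other ASCII markup pass through untouched, because only codepoints above 0x7F are rewritten.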
from htmlentities.
htmlentities can't do it. That's a bit of an ill-defined problem, though. The good news is that if you're doing what I think you're doing, there may be a very simple solution.
If you just want to make your HTML ASCII-safe, by encoding everything above 0x7F, that's really easy (in Ruby 1.9):
s = "<em>日本語</em>"
s.gsub(/\P{ASCII}/){ "&##{$&.unpack('U').first};" }
# => "<em>&#26085;&#26412;&#35486;</em>"
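A variation on the same idea, as a rough sketch for Ruby 1.9+ only: `String#ord` gives the codepoint directly, and a negated ASCII range stands in for `\P{ASCII}`. The method name `to_ascii_html` is hypothetical, and the input is assumed to be valid UTF-8.

```ruby
# Hypothetical helper for Ruby 1.9+; the name to_ascii_html is an assumption.
# Replaces each non-ASCII character with its decimal numeric character
# reference. ASCII markup, including existing entities, is left alone.
def to_ascii_html(html)
  html.gsub(/[^\x00-\x7F]/) { |ch| "&##{ch.ord};" }
end

to_ascii_html('<a href="/x?a=1&amp;b=2">日本語</a>')
# => "<a href=\"/x?a=1&amp;b=2\">&#26085;&#26412;&#35486;</a>"
```

Since tags, attributes, and already-escaped entities are plain ASCII, nothing needs to parse the HTML to skip them.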
@threedaymonk - Nice gem. Sorry for asking, but do you know if there's a way to accomplish the same thing you did with your code sample above in Ruby 1.8.7? The Regexp class doesn't have all the nice character classes and properties in 1.8.7 like 1.9 does.
We have an application currently stuck on Ruby 1.8, and we're working towards moving it to Ruby 1.9 or higher. In the meantime, I'm trying to determine the ideal way to encode characters (while leaving HTML tags alone), and I'm finding encoded characters as a whole harder to work with than on Ruby 1.9+. Any advice would be appreciated!
@threedaymonk - Thanks a bunch! I'm going to do some more extensive testing, but that seems to do the trick and it isn't raising any errors on 1.8.7. Any tips or pointers on documentation I can read to get better versed in working with encodings in Ruby?