javascript - Programming tips with Japanese Language/Characters -


I have an idea for a few web applications to write that would help me, and maybe others, learn Japanese better, since I am studying the language.

My problem is that the site will be mostly in English, so it needs to mix in Japanese characters: generally hiragana and katakana at first, and kanji later. I am getting closer to accomplishing this; I have figured out that the pages and source files need to be Unicode, with UTF-8 content types.

However, my problem comes in the actual coding. What I need is to manipulate strings of Japanese text. One example:

けす I need to take that verb and change it into its te-form, けして. I would prefer to do this in JavaScript because it will help down the road when I do more manipulation, but if I have to, I could make DB calls and hold everything in a DB.

My question is not just how to do it in JavaScript, but also what some tips and strategies are for doing these types of things in other languages. I'm hoping to do more of this, but when it comes to this kind of text handling, I'm lost.

What you want to do is very basic string manipulation, apart from the missing word separators (as Barry notes), although that is not a technical problem.

Basically, for a modern Unicode-aware programming language (which JavaScript has been since version 1.3), there is no real difference between a Japanese kana or kanji and a Latin letter: they are all just characters, and a string is simply, well, a string of characters.
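To illustrate how ordinary this string manipulation is, here is a minimal sketch of the けす to けして conversion from the question. The names `TE_ENDINGS` and `godanTeForm` are hypothetical, and the sketch assumes the input is a godan (u-) verb in dictionary form written in hiragana; ichidan verbs, する, くる and the いく exception are deliberately not handled.

```javascript
// Hypothetical lookup table: final kana of a godan verb -> te-form ending.
const TE_ENDINGS = {
  'う': 'って', 'つ': 'って', 'る': 'って',
  'む': 'んで', 'ぶ': 'んで', 'ぬ': 'んで',
  'く': 'いて', 'ぐ': 'いで',
  'す': 'して',
};

// Assumes `verb` is a godan verb in dictionary form, written in hiragana.
function godanTeForm(verb) {
  const chars = Array.from(verb); // split by code point, not by UTF-16 unit
  const last = chars.pop();
  const ending = TE_ENDINGS[last];
  if (ending === undefined) {
    throw new Error(`Not a recognised verb ending: ${last}`);
  }
  return chars.join('') + ending;
}

console.log(godanTeForm('けす')); // けして
console.log(godanTeForm('よむ')); // よんで
console.log(godanTeForm('かく')); // かいて
```

Nothing here is specific to Japanese as far as the language is concerned: the kana are used as object keys and concatenated exactly as Latin letters would be.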

Where it gets difficult is when you have to convert between strings and bytes, because then you have to pay attention to the encoding you are using. Unfortunately, many programmers, especially native English speakers, tend to gloss over this problem, because ASCII is the de facto standard encoding for Latin letters and other encodings usually try to be compatible with it. If Latin letters are all you need, you can get along while being blissfully ignorant of character encodings, believing that bytes and characters are basically the same thing, and write programs that mangle anything that is not ASCII.

So the "secret" of Unicode-aware programming is this: know when and where strings or characters are converted to and from bytes, and make sure the correct encoding is used in all of those places, i.e. the same one that will be used for the reverse conversion, and one that can encode all the characters you are using. UTF-8 is gradually becoming the de facto standard and should be used wherever you have a choice.
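The point about matching encodings can be sketched as follows. This assumes a Node.js environment (for `Buffer`); `TextEncoder` and `TextDecoder` are the standard encoding API, and Latin-1 is picked here only as an example of a mismatched encoding.

```javascript
const text = 'けす';

// Characters -> bytes: each of these hiragana takes 3 bytes in UTF-8.
const bytes = new TextEncoder().encode(text);
console.log(text.length, bytes.length); // 2 6

// Bytes -> characters with the same encoding restores the original text.
const roundTrip = new TextDecoder('utf-8').decode(bytes);

// Decoding the same bytes with a different encoding (Latin-1 here)
// raises no error; it just silently produces garbage.
const garbled = Buffer.from(bytes).toString('latin1');

console.log(roundTrip === text); // true
console.log(garbled === text);   // false
```

The mismatched decode is the dangerous case precisely because it does not fail: the program keeps running with corrupted text.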

Typical examples (not exhaustive):

  • When writing source code (the editor has to save the file in the intended encoding)
  • When compiling or interpreting such source code (the compiler / interpreter needs to know the encoding)
  • When reading / writing strings from or to a file (the encoding should be specified in the API, or in the file's metadata)
  • When storing strings in a database (the encoding should be specified in the configuration of the DB or the table)
  • When delivering HTML pages through a web server (the encoding should be specified in the HTTP headers or the page's meta tag; forms can be even trickier)
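Two of the places in the list above, file I/O and HTML delivery, might look like this in Node.js. This is only a sketch: the file name `kana-example.txt` and the handler name `sendPage` are made up for illustration, and `res` is assumed to be a Node `http.ServerResponse` (or anything with the same `setHeader` / `end` methods).

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const text = 'けして';
const file = path.join(os.tmpdir(), 'kana-example.txt'); // hypothetical file name

// File I/O: name the encoding explicitly on both the write and the read.
fs.writeFileSync(file, text, { encoding: 'utf8' });
const fromFile = fs.readFileSync(file, { encoding: 'utf8' });
console.log(fromFile === text); // true

// HTML delivery: declare the charset in the HTTP header and in the page,
// and encode the body with that same charset.
function sendPage(res) {
  const html =
    '<!DOCTYPE html><html><head><meta charset="utf-8"></head>' +
    `<body>${text}</body></html>`;
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.end(Buffer.from(html, 'utf8'));
}
```

A handler like `sendPage` could be passed a response object from `http.createServer`; the database case is configured in a similar spirit, but the details depend on the DB product.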
