2020-07-11 06:11:19 -04:00
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8"/>
</head>
<body>
<h1 id="localizationandyou:utf-8onnetbsd">Localization and You: UTF-8 on NetBSD</h1>
<p>NetBSD is a great little operating system, but it&#8217;s a much smaller project than Linux. This means there isn&#8217;t as much call for better internationalization support, as most of the users and developers are perfectly comfortable with ASCII or the ISO-8859-1 Western European character set. This can cause some problems when using software that expects Unicode text encoded as UTF-8, also known as the one true text encoding for the future. Here&#8217;s how to fix it. These instructions assume you&#8217;re using a Bourne-compatible shell like ksh, bash, or zsh. If you&#8217;re using (t)csh you&#8217;re on your own.</p>
<h2 id="environmentvariables">Environment Variables</h2>
<p>Most of the time, you can &#8220;fake&#8221; proper UTF-8 support by exporting three environment variables and leaving it up to your local terminal emulator to handle the rest. Add the following three lines to your ~/.profile:</p>
<pre><code>export LANG=&quot;en_US.UTF-8&quot;
export LC_CTYPE=&quot;en_US.UTF-8&quot;
export LC_ALL=&quot;en_US.UTF-8&quot;
</code></pre>
<p>Save, kill any screen or tmux sessions or other background processes, and log out. When you log in again, you should have a proper UTF-8 terminal as far as most programs are concerned.</p>
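<p>After logging back in, a quick check like the one below can confirm the variables took effect. This is only a minimal sketch: the helper name <code>is_utf8_locale</code> is made up for illustration, and it only inspects the variable values, not whether the system actually ships the locale.</p>

```shell
# Hypothetical helper: succeed if a locale name such as en_US.UTF-8
# ends in a UTF-8 codeset (optionally followed by an @modifier).
is_utf8_locale() {
    case "$1" in
        *.UTF-8|*.utf8|*.UTF-8@*|*.utf8@*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on each of the three variables exported from ~/.profile.
for name in LANG LC_CTYPE LC_ALL; do
    eval "value=\${$name:-}"
    if is_utf8_locale "$value"; then
        echo "$name is set to a UTF-8 locale: $value"
    else
        echo "$name is not a UTF-8 locale: ${value:-unset}"
    fi
done
```

<p>If any of the three lines reports a non-UTF-8 value, the shell probably did not re-read ~/.profile, which usually means an old session is still running.</p>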
<p>If the en_US.UTF-8 locale is not installed on the system, Perl will print the following warning when invoked:</p>
<pre><code>perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LC_ALL = &quot;en_US.UTF-8&quot;,
LC_CTYPE = &quot;en_US.UTF-8&quot;,
LANG = &quot;en_US.UTF-8&quot;
are supported and installed on your system.
perl: warning: Falling back to the standard locale (&quot;C&quot;).
</code></pre>
<p>Feel free to ignore this warning. Perl falls back to the &#8220;C&#8221; locale, but as long as you&#8217;ve got those environment variables set, you should be fine.</p>
<p>Python 3 assumes source files are UTF-8 text by default, so please make sure to set these variables before working on Python 3 code.</p>
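<p>To find out whether Perl is going to complain, you can look for the locale in the output of <code>locale -a</code>. The sketch below assumes <code>locale -a</code> prints one installed locale per line; the helper name <code>locale_in_list</code> is hypothetical.</p>

```shell
# Hypothetical helper: succeed if the locale named in $1 appears as a
# whole line in the list read from standard input.
locale_in_list() {
    grep -qxF -- "$1"
}

# Assumption: "locale -a" lists installed locales, one per line.
if locale -a | locale_in_list "en_US.UTF-8"; then
    echo "en_US.UTF-8 is installed; Perl should stay quiet."
else
    echo "en_US.UTF-8 was not found; expect the Perl warning."
fi
```

<p>Either result is fine for the purposes of this guide; the check only tells you whether the warning is expected.</p>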
<h2 id="rxvt-unicode">Rxvt-Unicode</h2>
<p>Rxvt-Unicode, urxvt, rxvt-unicode-256color. By whatever name you call it, it&#8217;s a very popular terminal among Linux and *BSD &#8220;power users.&#8221; Unfortunately, using urxvt adds an extra degree of difficulty to connecting to SDF - there&#8217;s no <code>$TERM</code> setting that corresponds to it! I&#8217;m sure some of you have tried logging in to SDF from urxvt, only to have scary warnings printed to stderr and have everything treated like a dumb paper teletype. Don&#8217;t worry, there&#8217;s a very simple fix for that as well. Open up ~/.profile again and add these lines:</p>
<pre><code>if [ &quot;$TERM&quot; = &quot;rxvt-unicode&quot; ] || [ &quot;$TERM&quot; = &quot;rxvt-unicode-256color&quot; ]; then
    export TERM=&quot;rxvt&quot;
fi
</code></pre>
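<p>The same idea can also be written as a <code>case</code> statement, which is a little easier to extend if you later need to remap other terminal names. This is just an alternative sketch; the function name <code>fallback_term</code> is made up for illustration.</p>

```shell
# Hypothetical helper: map urxvt's TERM names to plain rxvt and pass
# every other value through unchanged.
fallback_term() {
    case "$1" in
        rxvt-unicode|rxvt-unicode-256color) echo "rxvt" ;;
        *) echo "$1" ;;
    esac
}

# Only touch TERM if it is already set.
if [ -n "${TERM:-}" ]; then
    TERM="$(fallback_term "$TERM")"
    export TERM
fi
```

<p>Use one version or the other in ~/.profile, not both.</p>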
<p>In simple terms, this tricks NetBSD into thinking your terminal is rxvt, the original program urxvt is based on. If you have a MetaArpa account, don&#8217;t worry - the MetaArray is running CentOS, which understands urxvt just fine. </p>
<h2 id="escapecharacters">Escape Characters</h2>
<p>NetBSD&#8217;s terminal has what are called &#8220;escape characters.&#8221; These are characters in the &#8220;high ASCII&#8221; (decimal 128&#8211;255) range that manipulate the shell session when read from stdin or written to stdout. As you might imagine, this screws with programs that write large amounts of arbitrary bytes to standard output, like the &#8220;kermit -s&#8221; or &#8220;sz&#8221; file transfer programs.</p>
<p>For sx/sy/sz (X/Y/ZMODEM protocols) your best bet is to just not use them with SDF for now. If you&#8217;re on a TCP/IP connection (which most of you are) it&#8217;s easier to stick with scp/sftp for secure transfers, and http or ftp for insecure ones. If you really need &#8220;in-line&#8221; file transfer, there is a way to make &#8220;kermit -s&#8221; work around NetBSD&#8217;s escape characters: add the <code>-8</code> and <code>-0</code> flags. If I wanted to transfer the SQLite database &#8220;winning-lottery-numbers.sqlite&#8221; from SDF to my home machine, I would do it like this:</p>
<pre><code>tidux@sdf:~$ kermit -s -8 -0 winning-lottery-numbers.sqlite
</code></pre>
<p>Then my local kermit program would receive the transfer and I could continue working on SDF as usual. If you do this often, it may be wise to add an alias in your shell configuration files, like so:</p>
<pre><code>alias send='kermit -s -8 -0'
</code></pre>
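<p>If you would rather get a clear error for a mistyped file name before kermit starts, a small shell function can stand in for the alias. This is a rough sketch, not a tested replacement: the name <code>send_file</code> is hypothetical, and it passes the same flags in the same order as the alias above.</p>

```shell
# Hypothetical wrapper around the alias above: refuse to start a
# transfer when the file is missing or unreadable.
send_file() {
    if [ ! -r "$1" ]; then
        echo "send_file: cannot read $1" >&2
        return 1
    fi
    # Same flags, in the same order, as the alias.
    kermit -s -8 -0 "$1"
}
```

<p>You would call it as <code>send_file winning-lottery-numbers.sqlite</code>, then start the receive on your local kermit as before.</p>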
<hr />
<p>I hope this guide has been helpful to you. Happy UTF-8 hacking!</p>
</body>
</html>