Sunday, October 29, 2017

Update on Mplayer 1.3.0

Configuration


New configuration:
./configure --yasm='' --target=i486-Linux --disable-ftp --disable-tv-v4l1 --disable-tv --disable-qtx --disable-real --disable-win32dll --disable-gif --disable-mmx   --disable-3dnow   --disable-sse --disable-sse2 --disable-sse3 --disable-fastmemcpy --disable-xanim --disable-vm --disable-vesa --disable-svga --disable-xv  --disable-jpeg --disable-gl --disable-mga --disable-sdl --disable-fbdev  --disable-caca --disable-aa --disable-dga1 --disable-dga2 --disable-tga --disable-md5sum --disable-pnm --disable-mng --disable-libvorbis --disable-esd --enable-libmpeg2-internal --enable-libmpeg2 --disable-decoder=mlp --disable-mencoder --disable-freetype --disable-ass --disable-postproc --disable-unrarexec --disable-vidix --disable-vidix-pcidb   --disable-encoder=mlp  --disable-parser=mlp  --disable-protocol=mlp  --disable-demuxer=mlp  --disable-muxer=mlp --disable-yuv4mpeg     --disable-cmov  --disable-fast-clz --disable-fast-cmov --disable-mtrr
Manually disabled xmm clobbers in the configuration.
MPEG12_POSTPROC set to 0


Profiling

--enable-profile will not work if HAVE_INLINE_ASM is defined as 1: the build fails on cabarc.c, complaining about a missing register (ebx?).
No link-time -pg option is added either, so I added it manually. Results below.

Tested with sample mp4 file:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 17.98     11.11    11.11  3754368     0.00     0.00  idctSparseColPut_8
 16.64     21.39    10.28     4880     2.11     2.11  yuv2rgb_c_16_ordered_dither
  5.21     24.61     3.22  4386304     0.00     0.00  idctRowCondDC_8
  4.79     27.57     2.96   631936     0.00     0.00  idctSparseColAdd_8
  4.78     30.52     2.95   247680     0.01     0.13  ff_mpv_decode_mb
  3.93     32.95     2.43   247680     0.01     0.04  mpeg4_decode_mb
  3.25     34.96     2.01   956568     0.00     0.01  mpeg4_decode_block
  2.28     36.37     1.41   469296     0.00     0.00  dct_unquantize_h263_intra_c
  2.27     37.77     1.40   469296     0.00     0.00  ff_mpeg4_pred_ac
  2.14     39.09     1.32   469296     0.00     0.03  ff_simple_idct_put_8
  2.04     40.35     1.26   469296     0.00     0.00  ff_mpeg4_pred_dc
  2.01     41.59     1.24   183189     0.01     0.04  mpeg_motion
  1.85     42.73     1.14   234889     0.00     0.00  put_pixels8_8_c
  1.62     43.73     1.00    67500     0.01     0.01  put_pixels16_8_c
  1.46     44.63     0.90   469296     0.00     0.00  mpeg4_decode_dc
  1.44     45.52     0.89    52464     0.02     0.02  put_no_rnd_pixels8_xy2_8_c
  1.36     46.36     0.84     1376     0.61    39.72  decode_slice
  1.34     47.19     0.83   188391     0.00     0.04  ff_mpv_motion
  1.33     48.01     0.82    45901     0.02     0.02  put_pixels8_xy2_8_c
  1.32     48.83     0.82    49058     0.02     0.02  put_no_rnd_pixels16_8_c
  1.07     49.49     0.66   201780     0.00     0.00  ff_h263_update_motion_val
  0.96     50.08     0.60   161800     0.00     0.00  ff_h263_decode_motion
  0.79     50.57     0.49    53937     0.01     0.01  ff_clean_intra_table_entries
  0.76     51.04     0.47   247680     0.00     0.00  mpeg4_is_resync
  0.60     51.41     0.37    13688     0.03     0.03  avg_pixels16_8_c
  0.58     51.77     0.36     1105     0.33     0.33  copy
  0.57     52.12     0.35     7020     0.05     0.06  mp_msg_va
  0.49     52.43     0.31    33014     0.01     0.01  put_pixels8_l2_8
  0.47     52.72     0.29    78992     0.00     0.05  ff_simple_idct_add_8
  0.42     52.98     0.26  1016784     0.00     0.00  add_dct
  0.42     53.24     0.26    13440     0.02     0.02  ff_init_scantable
  0.40     53.49     0.25    33188     0.01     0.01  put_no_rnd_pixels8_l2_8
  0.40     53.74     0.25     3359     0.07     0.18  decode_vop_header
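
The ms/call columns round the small functions down to 0.00, so the per-call cost is easier to judge in microseconds. A minimal sketch in C that uses only the "self seconds" and "calls" columns from the table above; the helper name per_call_us is made up for this example:

```c
/* Convert one gprof flat-profile row into microseconds per call.
 * self_seconds and calls are copied by hand from the table. */
static double per_call_us(double self_seconds, unsigned long calls)
{
    if (calls == 0)
        return 0.0;               /* avoid dividing by zero */
    return self_seconds * 1e6 / (double)calls;
}
```

For idctSparseColPut_8 (11.11 s, 3754368 calls) this gives roughly 2.96 microseconds per call, and for yuv2rgb_c_16_ordered_dither (10.28 s, 4880 calls) roughly 2107 microseconds, which matches the 2.11 ms/call column. Adding up the % time of the five idct entries in the table (idctSparseColPut_8, idctRowCondDC_8, idctSparseColAdd_8, ff_simple_idct_put_8, ff_simple_idct_add_8) gives about 30.6%.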

Mplayer 1.3.0 and new dev PC

New dev pc


My Pentium 3 had some problems with its ext2 filesystem. A forced fsck sent the kernel into a panic. I used a Linux Mint 12 CD to fix this, but on the next reboot it was the same story. Shit happens. So the filesystem was dead.
Installed Slackware 10.1 in a VirtualBox VM. Recompiled the kernel, mpg123 and XFree86 for S3 (need to make a post about this).

Configuration of Mplayer


In the last post I used Mplayer version 0.90. It is old :D. So I downloaded version 1.3.0.
I used this command line to configure it:
./configure --yasm='' --target=i486-Linux --disable-ftp --disable-tv-v4l1 --disable-tv --disable-qtx --disable-real --disable-win32dll --disable-gif --disable-mmx   --disable-3dnow   --disable-sse --disable-sse2 --disable-sse3 --disable-fastmemcpy --disable-xanim --disable-vm --disable-vesa --disable-svga --disable-xv  --disable-jpeg --disable-gl --disable-mga --disable-sdl --disable-fbdev  --disable-caca --disable-aa --disable-dga1 --disable-dga2 --disable-tga --disable-md5sum --disable-pnm --disable-mng --disable-libvorbis --disable-esd --enable-libmpeg2-internal --enable-libmpeg2 --disable-decoder=mlp --disable-mencoder --disable-freetype --disable-ass --disable-postproc --disable-unrarexec --disable-vidix --disable-vidix-pcidb   --disable-encoder=mlp  --disable-parser=mlp  --disable-protocol=mlp  --disable-demuxer=mlp  --disable-muxer=mlp
  
This is all well and good, but there are problems in config.h and other places:

  • cmov enabled
  • i686 enabled 
  • gcc native enabled (I have an old compiler, so it will fail; atomic)
  • make is version 3.80 (needs version 3.82), otherwise there will be an error: make: *** virtual memory exhausted. Stop.  (http://ppcluddite.blogspot.com/2011/11/) (https://ftp.gnu.org/gnu/make/)
  • atomic.c is not compiled (needs a manual makefile change)
  • need to disable hwaccel drivers in allcodecs.c
  • unsupported -Werror=format-security option in configure for cc. Configuration fails.


If all these changes are made, it compiles and works.
Compiled binary size: 2.5 MB for v0.90 vs 12 MB for v1.3.0.

At least it plays videos. :D

To test I made some sample files using Shotcut: an mp4 file with the mpeg4 codec, QVBR 41%, and mp3 audio at 16000 Hz, 16 kb/s.

./mplayer -af channels=1 -framedrop testfile.mp4



Sunday, October 22, 2017

Mplayer 0.90

Play some videos

So it is time to test how well this 486 will play mpeg videos. I downloaded the oldest version of Mplayer and configured it with minimal settings (disabled everything that is not needed). Then I compiled it and tested how well it runs.

First video is without sound.

Second video has sound, reduced to one channel only.


So it can somewhat play, but any video larger than the last one here will not play with sound. With the -framedrop option it shows a black screen and plays audio. Without sound it is really slow.

So it is time to investigate how I can speed this up.
Bad:
  • No hardware yuv
  • No SIMD ( :D)
  • No hardware sound acceleration
Good:
  • Hardware BITBLT

Without profiling mplayer it is clear that mpeg idct takes most of the time.  

static inline void idct_row (int16_t * block)
{
    int x0, x1, x2, x3, x4, x5, x6, x7, x8;
    x1 = block[4] << 11;
    x2 = block[6];
    x3 = block[2];
    x4 = block[1];
    x5 = block[7];
    x6 = block[5];
    x7 = block[3];
    /* shortcut */
    if (! (x1 | x2 | x3 | x4 | x5 | x6 | x7 )) {
        block[0] = block[1] = block[2] = block[3] = block[4] =
            block[5] = block[6] = block[7] = block[0]<<3;
        return;
    }
    x0 = (block[0] << 11) + 128; /* for proper rounding in the fourth stage */
    /* first stage */
    x8 = W7 * (x4 + x5);
    x4 = x8 + (W1 - W7) * x4;
    x5 = x8 - (W1 + W7) * x5;
    x8 = W3 * (x6 + x7);
    x6 = x8 - (W3 - W5) * x6;
    x7 = x8 - (W3 + W5) * x7;

    /* second stage */
    x8 = x0 + x1;
    x0 -= x1;
    x1 = W6 * (x3 + x2);
    x2 = x1 - (W2 + W6) * x2;
    x3 = x1 + (W2 - W6) * x3;
    x1 = x4 + x6;
    x4 -= x6;
    x6 = x5 + x7;
    x5 -= x7;

    /* third stage */
    x7 = x8 + x3;
    x8 -= x3;
    x3 = x0 + x2;
    x0 -= x2;
    x2 = (181 * (x4 + x5) + 128) >> 8;
    x4 = (181 * (x4 - x5) + 128) >> 8;

    /* fourth stage */
    block[0] = (x7 + x1) >> 8;
    block[1] = (x3 + x2) >> 8;
    block[2] = (x0 + x4) >> 8;
    block[3] = (x8 + x6) >> 8;
    block[4] = (x8 - x6) >> 8;
    block[5] = (x0 - x4) >> 8;
    block[6] = (x3 - x2) >> 8;
    block[7] = (x7 - x1) >> 8;
}
So, how to speed this up even more? I searched the internet, and it seems it can't be made any faster, at least on a 486.
Surely there must be something that can be done.
I found this (http://jpegclub.org/jidctred/): it uses a 4x4 or 1x1 transform on an 8x8 block.

Can we do this with an mpeg 8x8 image block? If the DCT coefficients are in the top left corner, we can reduce the transform complexity by setting the inputs that come from block[4] to block[7] to zero (x1, x2, x5 and x6 in the listing above). If we compile with the -O3 option we get faster code, as gcc optimizes out the parts that only multiply by zero. Do the same with the column filter. Yes, it will reduce video quality.
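
Here is a rough sketch of what such a reduced row pass could look like. Going by the variable assignments in the listing above, the coefficients outside the left half of a row are block[4] to block[7], which the listing loads into x1, x2, x5 and x6. The sketch folds those zeros in by hand instead of leaving it to gcc. The function name idct_row_low4 is made up for this example, and the W constants are assumed to be the usual 2048*sqrt(2)*cos(k*pi/16) values that this style of integer IDCT uses. Like the original, it relies on arithmetic right shift for negative values.

```c
#include <stdint.h>

/* Assumed constants: 2048*sqrt(2)*cos(k*pi/16), rounded */
#define W1 2841
#define W2 2676
#define W3 2408
#define W5 1609
#define W6 1108
#define W7 565

/* Row IDCT that reads only block[0..3] and treats block[4..7] as zero.
 * Variable names follow the full idct_row listing. */
static void idct_row_low4(int16_t *block)
{
    int x0, x1, x2, x3, x4, x5, x6, x7, x8;
    int i;

    x3 = block[2];
    x4 = block[1];
    x7 = block[3];

    /* shortcut: only the DC coefficient is set */
    if (!(x3 | x4 | x7)) {
        int dc = block[0] * 8;
        for (i = 0; i < 8; i++)
            block[i] = (int16_t)dc;
        return;
    }
    x0 = block[0] * 2048 + 128;   /* +128 for rounding in the fourth stage */

    /* first stage, with x5 = x6 = 0 folded in */
    x5 = W7 * x4;
    x4 = W1 * x4;
    x6 = W3 * x7;
    x7 = -W5 * x7;

    /* second stage, with x1 = x2 = 0 folded in */
    x2 = W6 * x3;
    x3 = W2 * x3;
    x1 = x4 + x6;
    x4 -= x6;
    x6 = x5 + x7;
    x5 -= x7;

    /* third stage */
    x7 = x0 + x3;
    x8 = x0 - x3;
    x3 = x0 + x2;
    x0 -= x2;
    x2 = (181 * (x4 + x5) + 128) >> 8;
    x4 = (181 * (x4 - x5) + 128) >> 8;

    /* fourth stage */
    block[0] = (int16_t)((x7 + x1) >> 8);
    block[1] = (int16_t)((x3 + x2) >> 8);
    block[2] = (int16_t)((x0 + x4) >> 8);
    block[3] = (int16_t)((x8 + x6) >> 8);
    block[4] = (int16_t)((x8 - x6) >> 8);
    block[5] = (int16_t)((x0 - x4) >> 8);
    block[6] = (int16_t)((x3 - x2) >> 8);
    block[7] = (int16_t)((x7 - x1) >> 8);
}
```

Compared with the full version this drops six of the multiplications in the first two stages. Any energy that really was in block[4] to block[7] is simply thrown away, which is where the quality loss comes from.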

Success. It plays better. 

I also tested with only a 2x2 block, but the quality was not really good: the last video was blocky. The first video played smoothly.

Another problem is the code gcc generates for the 486. According to the AMD and Intel documents these CPUs have a pipeline, and gcc-compiled code with -O3 is not very good for it. There are probably AGI stalls in lots of places. (https://www.gamedev.net/articles/programming/general-and-gameplay-programming/optimize-386486pentium-code-r206/) Every clock tick counts.
If you compile with -march=i486 or -march=i386 you get identical code.
So I tested with i586. The code is arranged differently. I made a test program with this idct, and with i586 it was better: 18.8 s (i486) vs 18.06 s (i586), about 4% faster.
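
A test program of this kind can be very small. The sketch below is not the original test program, only an assumed shape for one: it refills an 8-coefficient row, calls the routine under test many times, and reports the CPU time from clock(). The names time_rows and dummy_row are made up; dummy_row is only a stand-in so the example compiles on its own and should be swapped for the real idct row function.

```c
#include <stdint.h>
#include <time.h>

typedef void (*row_fn)(int16_t *block);

/* Hypothetical stand-in for the routine under test */
static void dummy_row(int16_t *block)
{
    int i;
    for (i = 1; i < 8; i++)
        block[i] = (int16_t)(block[i] + block[0]);
}

/* Run fn on freshly filled rows and return the CPU seconds used */
static double time_rows(row_fn fn, long iterations)
{
    int16_t block[8];
    volatile int16_t sink = 0;    /* keeps the compiler from dropping the work */
    clock_t start, stop;
    long n;
    int i;

    start = clock();
    for (n = 0; n < iterations; n++) {
        for (i = 0; i < 8; i++)
            block[i] = (int16_t)((n + i) & 63);
        fn(block);
        sink = block[0];
    }
    stop = clock();
    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Building the same file once with -march=i486 and once with -march=i586 and running both on the 486 gives two numbers that can be compared directly. The iteration count has to be large enough that the run takes several seconds, otherwise the resolution of clock() dominates.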

The last thing is to profile. The result shows that another CPU eater is yuv2rgb_. Maybe a newer version of Mplayer has a better conversion?
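
To get a feeling for why this conversion is expensive without hardware yuv, here is the per-pixel arithmetic for one YUV pixel going to 16-bit RGB (5-6-5). This is not the code Mplayer uses; it is a plain sketch with the common BT.601 integer coefficients and no dithering, and the names clamp_shift8 and yuv_to_rgb565 are made up for the example.

```c
#include <stdint.h>

/* Scale down by 256 and clamp to the 0..255 range */
static int clamp_shift8(int v)
{
    if (v < 0)
        return 0;
    v >>= 8;
    return v > 255 ? 255 : v;
}

/* Convert one video-range YUV pixel to RGB565 (BT.601 integer approximation) */
static uint16_t yuv_to_rgb565(int y, int u, int v)
{
    int c = 298 * (y - 16) + 128;   /* luma term plus rounding */
    int d = u - 128;
    int e = v - 128;
    int r = clamp_shift8(c + 409 * e);
    int g = clamp_shift8(c - 100 * d - 208 * e);
    int b = clamp_shift8(c + 516 * d);

    /* pack 5 bits red, 6 bits green, 5 bits blue */
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}
```

Written like this, every pixel costs five multiplications plus clamping and packing, which is why this kind of code is normally replaced by lookup tables. Whether tables or a cheaper output format help on this machine would have to be measured.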

Sound

Mplayer uses mpg123 for mpeg audio decoding. It uses an old version, 0.50. I need to replace this.


Tuesday, October 17, 2017

Linux Kernel 2.2.x or not?

Compile newer version of Linux Kernel

So the BasicLinux 2 kernel is version 2.2.16. I wanted to compile a newer kernel on another machine. So I completely reinstalled my Pentium III PC, which had the Debian-based Linux Mint 12 or so. I used the slackware-10.1 iso for that.
First I tested version 2.4.29. It compiled fine. A quick copy onto the 486, and it would not boot. The IDE drivers were missing. Fixed that and it booted.
The next thing was getting it to connect to the LAN. I wasted a lot of time and realized that it has a different structure. I had the drivers built into the kernel. Nothing happened. Then I made a module version. The kernel tries to load the modules with modprobe. Well, this was not going to work. So I dropped version 2.4.29.

Downloaded the source of kernel 2.2.21 to this Pentium PC. It would not compile. The gcc version was 3.3.5. It complains about a lot of stuff:

  • conflicting types for `xtime' (https://www.linux-mips.org/archives/linux-mips/2001-08/msg00014.html)
  • error: long, short, signed or unsigned used invalidly for `slot_tablelen'  (https://www.linux-mips.org/archives/linux-mips/2001-08/msg00014.html)
  • '##'  (https://gcc.gnu.org/ml/gcc-bugs/2000-08/msg00723.html)
  • __asm__  (https://lwn.net/Articles/66793/) quotes are missing in arch i386
  • warning: pasting "(" and "0x0" does not give a valid preprocessing token { (https://forums.anandtech.com/threads/help-linux-kernel-compile-issues.711957/)
  • etc
Finally managed to compile this kernel. Swapped out my 2.4.29 bzImage.
Result: Kernel panic
Unable to handle kernel NULL pointer dereference at virtual address 00000f2c 
(https://www.linuxquestions.org/questions/linux-software-2/kernel-panic-swapper-not-syncing-227208/)
Removed all modules, only blank kernel. Still panic. Gave up. Time wasted again.

Googling around the web I noticed another cool-looking distribution: http://delicate-linux.net/

Upgrading to Slackware 10.1 

So I was thinking, maybe I can still use kernel 2.4.29. The 486 has a CD-ROM, but it can't boot from it. The Slackware 10.1 ISO image has a ramfs, so I made a copy to my DOS partition, where I boot my kernel with loadlin.exe into the ramfs. (Had to increase the ramfs size in the kernel, another recompilation of many.) So after booting from DOS I mounted the CD-ROM drive and started to install Slackware 10.1 onto my drive. I selected the packages I needed. Probably too many. It took forever to install. 2 hours, I think.
I had already compiled kernel 2.4.29 for the 486. I booted again after the install was done. It booted from the freshly installed hard drive. After some more tweaking with this kernel I was finally happy, it worked.
One thing that was problematic was memory usage. BasicLinux 2 had only small files, no services, etc.
The window manager failed to start. I could not figure out why. Finally I decided to use the BasicLinux X Server with icewm. Made some changes to paths in /lib and /usr. I got my GUI. Most of the programs complain about missing lib files. Sure, I dropped the newer X Server.

I wanted to compile X Server 6.3 for it. After a successful compilation I transferred the files to the 486. XF86Config was problematic: the X Server complained that Mode is missing. I did not give up, and after some changes there I managed to start the freshly compiled X Server.

Networking

So I have two machines with the same version of Slackware. I can use putty and winscp on my main box to reach both over the network. It made it a lot easier to copy compiled files from the Pentium machine over to the 486 through my main box :D.

Sound

Well, it took a while to get it working. For some reason alsaconf wanted to insmod modules. I had my sound card driver compiled into the kernel. That did not work for alsaconf.
I removed all sound drivers from the kernel and built them as modules (recompiled again), and it looked like it worked. At least alsamixer started working.
I needed some program that can output some sound. I remembered that in the package selection phase I had selected a cli mp3 player, namely mpg123.
Started up this player. The first second of the song was OK. But after that I got:
ALSA: underrun, at least 0ms

CPU usage was 99%. There was no music, only noise.

This can't be happening.
So looked around:
http://www.tldp.org/HOWTO/MP3-HOWTO-7.html

In: http://www.mpg123.org/
Plays Layer 3 in stereo on an AMD-486-120Mhz or (of course) a faster machine.

The mpg123 version was 0.54. Old. I downloaded the latest source, compiled it, and it hung the 486. Really?
I have an old distro with old libs. Probably this is the reason. It gives a warning if you compile with the i486 target :D. Finally I used version 1.22.4 with a small tweak in the alsa module. I reverted one function, and it compiled and worked fine.
Command line was:
./configure  --with-optimization=3 --with-cpu=i486  --enable-int-quality=yes  --enable-buffer=yes CFLAGS='-march=i486' --enable-modules=no

With option -2 (2:1 downsampling) CPU usage was 27-36%. Success.

Well, I have an internet connection on this old PC. Why not listen to some online stream? Used (https://www.kexp.org/help/ListenLiveLinks) MP3 32k (Standard, Low Bandwidth).
Made a shell script:
#!/bin/sh
./mpg123 http://live-mp3-128.kexp.org:80/kexp128.mp3

Now I have a radio :D


Saturday, October 14, 2017

Problems with sound

I noticed that when I boot from Windows 95 to DOS I get an error from EMM386, something like:
EMM386 has detected error #
PC hangs and i need to turn it off.
At first it did not bother me. The problem became real when I booted to DOS with the F8 option and had no sound. I wanted to see how well Quake 1 would run on this computer, but without sound it's pointless.
To get sound working I needed the EMM386 memory manager to work.
Every time the sound driver was loaded, DOS hung when running any program after that, with the above error from the memory manager.
So I tried another PCI sound card. Downloaded the DOS drivers and got the same problem. Replaced EMM386 and himem.sys with clean versions, but the problem remained. Used a third PCI sound card, again the same problem. No PCI sound card can run without the EMM386 memory manager. At least not my sound cards.
Problems that may be the cause:

  • Faulty main memory
  • Wrong BIOS settings.
I still wanted to see how well Quake 1 will run. In Windows 95 with sound it has only 8MB free memory to use. It was really slow.

Final attempt. I used an ISA ESS AudioDrive ES1868F sound card. I pulled it from another PC, installed it and configured autoexec.bat.
Quake 1 showed sound card and sound was there. It ran quite well. At least for 320x200 resolution.

Thursday, October 12, 2017

L2 Cache

Speedup

I wanted to get more speed out of the L2 cache. Memory settings are automatically configured by the BIOS.
Baseline settings


Changed BIOS to manual and tested with speedsys.

Cache/Memory Benchmark                

                     Read          Write         Move         Average  
 Cache Level 1    113.16 MB/s    38.29 MB/s   143.89 MB/s    98.45 MB/s 
 Cache Level 2     61.30 MB/s    38.05 MB/s    32.25 MB/s    43.87 MB/s 
 Memory            26.51 MB/s    38.65 MB/s    11.35 MB/s    25.50 MB/s 
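
To put numbers on the change, the table above can be compared with the speedsys benchmark from October 5 (L2 read 43.80 MB/s, L2 move 23.57 MB/s), assuming those earlier figures were taken with the automatic BIOS settings. A small helper for the percentage, with a made-up name:

```c
/* Relative change between two benchmark readings, in percent */
static double percent_gain(double before, double after)
{
    return (after - before) * 100.0 / before;
}
```

With these inputs the L2 read speed went up by about 40% (43.80 to 61.30 MB/s) and the L2 move speed by about 37% (23.57 to 32.25 MB/s). The L1 and main memory read figures are nearly unchanged, which fits a change that only touches the L2 timing.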

New settings in BIOS

512 KB L2 cache

Bought some new UM61512AK-15 L2 chips so I can upgrade to 512KB of L2 cache. I already had 4.
Installed and changed some jumpers on motherboard.
512 KB of L2 cache

Friday, October 6, 2017

Running Linux

Downloading files and installing Linux

There are so many Linux distributions that can be used on this old PC. I searched and looked at requirements that came up.
So the list I considered:

  • Tiny Core Linux
  • Puppy Linux
  • DeLi Linux
  • BasicLinux
  • Damn Small Linux
So first I downloaded DSL (Damn Small Linux). It has a nice boot and a great selection of settings, but it was really slow. So I was thinking maybe Tiny Core Linux. Not good, the minimum RAM is too high. Finally I came across BasicLinux. I used version 2.1 http://www.nomdo.nl/linux.htm.
I followed the instructions to boot from the DOS drive. It booted up fine.
So I set up the hard drive and downloaded all the files needed to boot it from disk. Then, when all was OK, I added the X Server. It worked.
The standard setup only has a video server for 640x480x4 color mode. No hardware acceleration. Downloaded the S3 X Server from a Slackware mirror. Used the Slackware 7.1 version for that. After some setup in the configuration files I wrote in the console:
startx
And I got my GUI running in 800x600x16 mode. I can use 24 bit color mode, but the resolution is lower.

Opera

Final thing was adding Opera 8.54. Could not find it anywhere. After heavy searching finally found version that was needed. Installed it and started surfing.
Current blog
Maybe newer version of Opera works better.
I did not bother installing OpenOffice 1.1.5.

Browsing 

Browsing is problematic with https, some pages do not load.

Facebook works fine.
https://mbasic.facebook.com/


Thursday, October 5, 2017

Benchmarking 486

Benchmarking

To test out this 486 PC I used the speedsys tool.

Here are the results.
First HDD and PCI/ISA info.
 CPU  external L2 Cache is in Write-back mode.
CPU info and second HDD info.

Cache/Memory Benchmark

                 
                     Read          Write         Move         Average   
 Cache Level 1    111.70 MB/s    38.27 MB/s   139.63 MB/s    96.53 MB/s 
 Cache Level 2     43.80 MB/s    38.02 MB/s    23.57 MB/s    35.13 MB/s 
 Memory            26.49 MB/s    38.34 MB/s     9.73 MB/s    24.86 MB/s 


Hard drive Benchmark

Hard drive 0:     527C 32H 63S 518 MB   ST3660A
                  Tested in FAST mode
                  Average/Max seek time  : 1.29 ms / 1.36 ms
                  Random seek time       : 1.34 ms
                  Track-to-track seek    : 1.22 ms
                  Random access time     : 24.83 ms
                  Linear verify speed    : 1895 KB/s
                  Min/Max verify speed   : 1248 KB/s / 2197 KB/s
                  Linear read speed      : 1199 KB/s
                  Min/Max read speed     : 1193 KB/s / 1215 KB/s
                  Buffered read speed    : 1235 KB/s

                  Hard Drive speed index : 26.45

Hard drive 1:     784C 128H 63S   3.01 GB FUJITSU MPB3032ATU
                  Tested in FULL mode
                  Average/Max seek time  : 10.73 ms / 17.37 ms
                  Random seek time       : 9.47 ms
                  Track-to-track seek    : 3.03 ms
                  Random access time     : 16.88 ms
                  Linear verify speed    : 8309 KB/s
                  Min/Max verify speed   : 5696 KB/s / 10334 KB/s
                  Linear read speed      : 1395 KB/s
                  Min/Max read speed     : 1395 KB/s / 1396 KB/s
                  Buffered read speed    : 1401 KB/s

                  Hard Drive speed index : 170.65


Video card Benchmark


Used tools pack from http://www.philscomputerlab.com/dos-benchmark-pack.html
For comparison see: 





To compare this PC video card S3 Trio with the above results:
  • Chris 3D bench score: 74.1
  • Chris 3D bench SVGA score: 22.5



Sunday, October 1, 2017

Hardware configuration

Setting up this old PC is nothing special.
I used th99 archive to get spec for motherboard. But there are some differences as some markings and jumper locations are in different places.

I clocked CPU to 120MHz. It has 8KB of L1 cache.
Replaced the L2 cache chips with UM61256AK-15. There are four 32K x 8 + one 32K x 8 as TAG. Total 128KB of L2 cache. Maximum can be 1MB, but I don't have the chips. BIOS setting is WB.

The first HDD already had Windows 95 (code name Chicago) installed. It is the first version, 4.00.950. See wikipedia.

I added second HDD from my collection FUJITSU MPB3032ATU.

Video card is PCI S3 Trio32/64 86C732 with 1MB of RAM.
Also PCI sound card Creative 5880. Not tested currently.
The motherboard has one free slot, which can be filled with a PCI USB board I have (VT6212L).

Saturday, September 30, 2017

486 Computer

So I have a computer with a 486 CPU and 16 MB RAM.
I am planning to use Windows 95/Slackware Linux OS on this PC.
Also surfing on the web and doing some word processing and multimedia. Seems impossible with today's standards.

To be fair, I have already done some of these things. But it's better to document all of this.

Specs for the PC