

3D Mind will cluster your compounds for you,
and it can also show you what else is in that cluster.

go into 3D Mind tools and use node 23+18 (cluster 16.8 + 17.9)
or node 13+11+7+3+...

or you can search by SMILES string: "C1CC3=C(CO1)CN2CC5=C(C2=C3)N=C4C=CC=CC4=C5" here

here's a paper that uses NCI's anticancer database to do QSAR

found a pretty good entry point for cancer screen data from NCI.
for AIDS screen data, get compounds here, then get activity here, or see structures here.

what are GI50, TGI, and LC50 in the NCI database
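
as far as I understand the NCI screen, all three values are read off the percent-growth curve: GI50 is the concentration where percent growth is +50, TGI where it is 0, and LC50 where it is -50. here is a minimal Python sketch of the percent-growth formula. the function name and argument names are my own, not from any NCI tool:

```python
# Percent-growth targets that define the three NCI endpoints.
GI50_TARGET = 50.0   # 50% growth inhibition
TGI_TARGET = 0.0     # total growth inhibition
LC50_TARGET = -50.0  # 50% net cell kill


def percent_growth(t_zero, control, test):
    """Percent growth for one well.

    t_zero:  measurement at the time the drug is added (Tz)
    control: measurement of the untreated control at the end (C)
    test:    measurement of the treated well at the end (Ti)
    """
    if test >= t_zero:
        # Cells still grew (or held steady): scale against control growth.
        return 100.0 * (test - t_zero) / (control - t_zero)
    # Fewer cells than at time zero: scale against the starting amount.
    return 100.0 * (test - t_zero) / t_zero
```

so a well that ends up halfway between time zero and the control sits exactly at the GI50 level, and a well that ends at half of its time-zero value sits at the LC50 level.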

you can use online tools (NCI 3DMiner or Enhanced NCI Database Browser) to search the compound list here; then you can easily compare those compounds.

after you spot a substructure shared by all the structures above
and get more structures by substructure search,
you can use this page to see their activities.

Ward in JKlustor can work with GenerateMD, so you can cluster using pharmacophore fingerprints.
Ref: example section of this page


you can use Jarp and Ward in JKlustor
to cluster the compounds based on NCI's data
ref: http://www.chemaxon.com/conf/Eurocombi_poster_Ltr.pdf

there's a freeware tool like Ghost
called g4u (ghost for unix).
it boots from a floppy disk, can upload a hard disk image to an FTP server, and can dump it to another computer's hard disk.

I think I can find a drug for AIDS here.(example)

and it will give me several similar compounds and their CAS#

then I can use those CAS# to search activity data here

for cancer, use these: Drug List 1, Drug List 2, and Activity

PS: if you're interested in anti-cancer plants, you can find them and the compounds they produce here or here.


Oops, I WAS planning to write an automated QSAR program for binf7592,
but after I read Ch. 5.3.3 in the paper "Structure Database" the teacher gave us on 10/20,
I noticed several tools already exist: "Catalyst" and "APEX".
details about catalyst and apex

drug activity data can be obtained from here for binf7592 project

other activity database for future job can be obtained here

we know a SMILES string is not unique,
but it's very easy to translate it to unique SMILES.


Grail has a socket interface
ref original
detail (official)

use meme to find motifs
./meme ../../markfilter/shrtincor.fa -protein -mod zoops -nmotifs 20 -minsites 2 -maxsites 5 -minw 3 -maxw 50 -evt 10000 -time 7200 -maxsize 60000 -nostatus -maxiter 25 > output.html

use meta-meme to make hmm
./mhmm -meme ../../../meme.3.0.4/bin/output.html > test.mhmm

test sequences using that hmm
./mhmmscan -hmm test.mhmm -seq ../../../markfilter/shrtcor.fa 2>/dev/null | more

META-MEME seems good for my HMM engine,
but after some research, maybe MEME will suit me more.
it can find motifs for me in the sequences.
and of course, if you want HMM, you are one button away!

seems I can use it on the supercomputer, but it's still slow...


I think I might use HMMER or SAM or HTK or GHMM as my HMM engine,
and train the model using both a positive exon-intron group and a negative exon-intron group.

web interface for converting between HMMER and SAM.

using netcat to scan your computer's ports:
nc -v -z localhost 1-9999
will scan port 1 through port 9999 for you without delay.
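
the same kind of check can be sketched in Python with plain TCP connect attempts. this is only a rough equivalent of the nc scan above, not how netcat itself works inside; the function name scan_ports and the 0.2 second timeout are my own choices:

```python
import socket


def scan_ports(host, first, last, timeout=0.2):
    """Return the TCP ports in [first, last] on host that accept a connection."""
    open_ports = []
    for port in range(first, last + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

# Example call (scans your own machine only):
#   scan_ports("127.0.0.1", 1, 9999)
```

only scan machines you own or are allowed to test.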


Cambridge is pretty easy to use
short tips:
on octane-2,
just cd to "/scratch/molmod/cambridge/cambridge/bin"
and type "cq"
then, you got it!

pretty easy to use, but not many records in the database yet.

quota on /research is only 5MB.

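you can use this program to see whether the association comes from HSSEX or really from HFE7*HSSEX:

proc glm data=adult;
where HAD1<3;
class HFE7 HSSEX;
/* model statement added because lsmeans needs one; HAC1E as the response is assumed from the example below */
model HAC1E = HFE7 HSSEX HFE7*HSSEX;
lsmeans HFE7 HSSEX HFE7*HSSEX/pdiff;
run;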

at the end of the following page, you'll see how to use SAS to analyze multiple variables.

ex: I know HFE7 is associated with HAC1E, but does it differ between men and women (HSSEX)?

After 4 days of trying, I finally know how to use SAS to figure out which variables are associated with a specific variable.

you can use the following SAS program to figure out which variables among HAC1A~HAC1O
are associated with HFE7:
proc glm data=adult;

after you find something with P<0.05 (ex: HAC1E)

you can use the following SAS program to see
what the difference is between groups 1 and 2 of HFE7:
proc glm data=adult;
class HFE7;
model HAC1E=HFE7;
lsmeans HFE7;

this helps
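
for checking the idea without SAS, here is a minimal Python sketch of what the one-factor proc glm above works out: the mean of the response in each group and the F statistic for the group effect. it assumes a single class variable (in that case the lsmeans are just the plain group means), the function name is my own, and it does not compute the P value, since that needs an F distribution that the standard library doesn't have:

```python
from statistics import mean


def one_way_anova(groups):
    """groups maps a class level (e.g. an HFE7 value) to a list of responses.

    Returns (group_means, f_statistic).
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    group_means = {level: mean(values) for level, values in groups.items()}

    # Variation explained by the grouping.
    ss_between = sum(
        len(values) * (group_means[level] - grand_mean) ** 2
        for level, values in groups.items()
    )
    # Variation left inside each group.
    ss_within = sum(
        (x - group_means[level]) ** 2
        for level, values in groups.items()
        for x in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return group_means, f_statistic
```

the numbers you feed in would be the response values (ex: HAC1E) split by group (ex: HFE7 = 1 or 2).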


For scanning your network for holes:

Leak Testers:

LeakTest - http://grc.com/lt/leaktest.htm
TooLeaky - http://tooleaky.zensoft.com
Firehole - http://keir.net/firehole.html
Yalta - http://www.soft4ever.com/security_test/En/index.htm

Online Port Scanners:

Sygate – Option in software menu.
Symantec - http://www.symantec.com/cgi-bin/securitycheck.cgi
PC Flank - http://www.pcflank.com/index.htm


when searching on PubMed, don't use ":" in the query.
if the title of the paper has ":" in it, remove it when searching,
or you'll never get a result.
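
if you paste titles into a script, the cleanup is a one-liner. a tiny Python sketch (the function name is my own, and the title in the comment is made up):

```python
def pubmed_title_query(title):
    """Replace ':' with a space and collapse extra whitespace."""
    return " ".join(title.replace(":", " ").split())

# Made-up example title:
#   pubmed_title_query("Gene finding: a short example title")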

Augustus handles start_codon and stop_codon well,
can detect short exons,
and can be downloaded and run locally.


I previously misunderstood this site (http://www.fruitfly.org/seq_tools/splice.html):
it shows the donor site's start point and end point, but not the cut position.
now that I understand this, it can be used in the binf7600 project.

I think I can find Alternative Splicing Sequences and data from ProSplicer,
to prove my model,
then train my model.

today NetGene2 isn't working, so I'm trying to find others.

maybe see whether the "exonic splicing enhancers predictor" can help improve ours.
ref: http://nar.oupjournals.org/cgi/reprint/31/13/3568.pdf

this site does the same thing as me: it connects to other sites to obtain interstat values.
ref: http://www.inra.fr/bia/T/schiex/Export/LNCS-EuGene.pdf

this software is a generic framework for the integration of gene-prediction data
ref: http://www.genome.org/cgi/reprint/12/9/1418.pdf


there's even a database here for my whole theory.
this one looks better

you can see what happens when it cuts in different places and doesn't get it right.

figure 4 in this paper proved my third theory(10/8/03).

I found a scary thing on my special Linux CD.
although I knew that CD was very powerful,
I didn't know it was that scary...

it not only has ethereal and tcpdump,
it also has "ettercap"!
a tool which can hack even in a switched environment: ssh, OS fingerprinting, disconnecting connections, typing characters for others!

here's a very, very good introduction to this kind of tech and tools, and ways to prevent it!


I found material that totally supports my theory (10/8/03), and they also have an interesting paper about differences in RNA splicing between normal cells and aged cells.

I think this group is going in the same direction as me.

You can use the Blogger API to extract your blog from blogger.com in XML format
(tested using the WSDL and SoapClient web interface)


I found part of the answer to why hnRNP is needed in RNA splicing (to cut intron/exon).
Because hnRNP A2 on RNA will use kinesin on microtubules to move itself (supported by this paper),
and this paper also supports my theory,
so it can move two spliceosomes together (my theory).

this paper supports another theory of mine, "a novel class of genes which, although they encode polyadenylated RNA, might not make a translated protein."

this abstract also supports my third theory: when the enzymes see the sequence, they don't know whether they are cutting in the right place or not, they just cut it. There's another place that checks whether they cut right or not.

and I'll use these three theories in my binf7600 project.

and this one should be able to help me detect hnRNP binding sites.

(I found this on 10/10/03; it's drawn exactly as I thought, and it supports all my theories!)

bioinfo tutorial in flash animation (chinese)

you can search similar structures or substructures using
and AIDS, Cancer activity also. (not working on 10/8/03)

Since I can't use SAS for analysis at the PC Lab, due to Go Back and the hard disk limit there, I decided to do it at home.
Since I don't have SAS, and the student version costs $50 and only works for 3 months, I decided to use something free.
So I found OpenStat, a freeware program that can do lots of statistical analysis.
It has both Windows and Linux versions.
Now I use SAS at school to translate the data to CSV format, but I think I should write a translator to do that more quickly.


for binf7592
I like ChemSketch, cause it can check the structure for me easily, both in 2D and 3D.
but if I need to create something like Markush Structure, I think I'll use ChemSketch first then use ISIS Draw to make it like Markush Structure.

after one day of testing and researching,
I found the answers to my problems:

VMWare's NIC doesn't support PXE.
ClusterKnoppix 9/5/03 version has "terminalserver + etherboot" bug
should be fixed in 9/24/03 version

Windows on CD, Linux on CD (intro in chinese)
knoppix pxe, how to configure debian manually, how to make boot disk. (in chinese)
Diskless Remote Boot in Linux (DRBL) for Redhat 8.0 (tutorial in chinese)
knoppix pxe faq

bochs: virtual PC software like VMware, but it's open source.


Bot for MSN chat room
how to use response.txt?
Remember to restart the ViperBot after making changes as described above.
Tested in W2K english version, works fine.
But failed in Chinese version.

What is Markush Structure? (Chinese)
How to search pharmaceutical compounds in patent database using Markush Structure search.

more info


How to make an NCBI Blast at home on clusters with OpenMosix support?

Booting Windows From CD-ROM (windows on cd)


today I saw something that I wanted to create in the past:
a cluster system which requires no installation and no hard disk.
just pop a CD-ROM into the PCs, and they work as one clustered system.
so it can easily work in the PC labs at school. :p

It's called ClusterKnoppix

and it's even better!
client nodes don't even need a CD-ROM! (when booting from a PXE chip [ROM]
or a floppy disk)